Bridge Up Scraper is a Python-based service that monitors and analyzes bridge statuses along the Great Lakes St. Lawrence Seaway. It scrapes real-time data from official websites, processes this information, and stores it in Firebase Firestore. Containerized with Docker, it provides comprehensive insights into bridge operations with automated updates.
This is a hobby project, so expect breaking changes. Don't scrape their website aggressively or they will probably block your IP. Also, because it relies entirely on the St Lawrence Seaway website, it will stop working completely if they change their HTML layout or block public access 💀 No warranty or guarantees of any kind are provided; use at your own risk.
- 👀 Scrapes bridge status information from multiple regions
- 📊 Stores and manages historical activity logs
- 🗓️ Organizes and displays current and upcoming bridge events
- 📈 Calculates statistics from historical data
- 🔥 Utilizes Firebase Firestore for efficient data storage
- 🧹 Automatically cleans up old and irrelevant historical data
- 🐳 Containerized with Docker for easy deployment
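The scraping itself boils down to fetching each region's status page and parsing the bridge rows out of the HTML. The real selectors and URLs live in `scraper.py` and `config.py`; the sketch below is only an illustration, with the URL, table layout, and function name all assumed.

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL; the real per-region URLs are defined in config.py.
STATUS_URL = "https://example.com/bridge-status"

def fetch_bridge_statuses():
    """Fetch the status page and extract (name, status) pairs from its table."""
    response = requests.get(STATUS_URL, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    statuses = []
    for row in soup.select("table tr"):
        cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
        if len(cells) >= 2:  # skip header rows, which have no <td> cells
            statuses.append({"name": cells[0], "status": cells[1]})
    return statuses
```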
- Firebase
- Docker
- Python 3.9+
- pip
- Clone the repository
- Add your Firebase credentials as a `firebase-auth.json` file in the project root (see the Firestore sketch after these steps)
- Create a virtual environment:
```bash
python -m venv venv
source venv/bin/activate  # On Windows, use venv\Scripts\activate
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
- Run the application:
```bash
python start_flask.py
```
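If you're wiring the app up to your own Firebase project, this is roughly how a service-account file like `firebase-auth.json` is loaded with the `firebase-admin` package. The collection and document names below are made up for illustration, not necessarily the ones the app uses.

```python
import firebase_admin
from firebase_admin import credentials, firestore

# Load the service-account credentials from the project root.
cred = credentials.Certificate("firebase-auth.json")
firebase_admin.initialize_app(cred)

db = firestore.client()

# Hypothetical example: write one bridge's current status.
db.collection("bridges").document("example-bridge").set({"status": "Available"})
```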
You can either build the Docker image yourself as described below, or download it from Docker Hub by clicking here. The image is updated via a GitHub Workflow every time a commit is made.
- Build the Docker image:
```bash
docker build -t bridge-up-backend .
```
- Run the Docker container (make sure to mount your Firebase credentials file, `firebase-auth.json`, into the container):
```bash
docker run -p 5000:5000 -v /path/on/host/firebase-auth.json:/app/data/firebase-auth.json bridge-up-backend
```
Bridge URLs and coordinates are configured in `config.py`. I'm not 100% sure about the locations of a couple of the bridges (since I don't live in the area), and I only have the bridge numbers for St Catharines and Port Colborne. Modify this file or submit a pull request if you know what they should be.
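For reference, an entry in `config.py` might look something like this sketch. The key names, URL, and coordinates are assumptions for illustration; check the actual file for the real structure.

```python
# Hypothetical shape of the bridge configuration; the real config.py
# may use different keys and values.
BRIDGES = {
    "st_catharines": {
        "url": "https://example.com/bridge-status/st-catharines",  # placeholder URL
        "bridges": [
            # Placeholder number, name, and coordinates.
            {"number": 1, "name": "Example Bridge", "lat": 43.2, "lng": -79.2},
        ],
    },
}
```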
The application uses APScheduler to run tasks; the schedule is managed inside the `start_flask.py` and `start_waitress.py` files (see the sketch after this list). The default intervals are pretty aggressive, so you should probably lengthen them or risk getting your IP banned.
- 🌞 Scrapes and updates bridge data every 30 seconds from 6:00 AM to 9:59 PM
- 🌙 Scrapes and updates bridge data every 60 seconds from 10:00 PM to 5:59 AM
- 🧮 Runs daily statistics update at 4 AM
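Here's a rough sketch of how that schedule maps onto APScheduler's `CronTrigger`. The real job setup lives in `start_flask.py` and `start_waitress.py`, and the job functions below are hypothetical stand-ins.

```python
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.cron import CronTrigger

def scrape_bridges():
    """Hypothetical stand-in for the scrape-and-update job."""

def update_daily_stats():
    """Hypothetical stand-in for the daily statistics job."""

scheduler = BackgroundScheduler()

# 06:00-21:59: fire at seconds 0 and 30 of every minute (every 30 seconds).
scheduler.add_job(scrape_bridges, CronTrigger(hour="6-21", second="*/30"))

# 22:00-05:59: fire at second 0 of every minute (every 60 seconds).
scheduler.add_job(scrape_bridges, CronTrigger(hour="22-23,0-5", second="0"))

# Daily statistics update at 04:00.
scheduler.add_job(update_daily_stats, CronTrigger(hour="4", minute="0"))

scheduler.start()  # runs in a background thread alongside the web server
```

If you want to be gentler on the Seaway's servers, widening the `second` field (or switching to a minute-level schedule) is the easiest change to make.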
- `scraper.py`: Main script for scraping and processing bridge data
- `stats_calculator.py`: Calculates bridge statistics
- `start_flask.py`: Starts the Flask development server
- `start_waitress.py`: Starts the Waitress production server
- `config.py`: Configuration for bridge URLs and coordinates
Contributions are welcome. Please submit a pull request or create an issue for any features or bug fixes.
GPL v3: You can do whatever you want as long as you give attribution and any software you use it in also has an open license.