README.md (1625B)
# Forerad
This repository is a collection of utilities for working with Citibike data. It lets you download all of Citibike's ride history archives, transform them as you see fit, and load them into a SQLite database for easy querying.

This repository is what I use to build the SQLite database behind [Citibike Explorer](https://citibike.stevegattuso.me). It is also potentially useful if you don't feel like writing your own scraper to download, unzip, and load trip history archives into a `pd.DataFrame`.

## Installation and usage
Clone the repository, cd into the directory, and run:

```bash
$ python -m virtualenv .venv
$ source .venv/bin/activate
$ pip install -r ./requirements.txt
```

Once the requirements are installed, you can use `./bin/scraper` to download the trip archives individually or all in one go. See `./bin/scraper --help` for details.

There is also `./bin/hourly-volume-rollup`, which parses all available archives and rolls up the trip data into an hourly timeseries. Note that this requires provisioning a SQLite database, which can be done by running `yoyo apply`.

If you're just looking to load an archive into pandas, here's the code snippet you're looking for:

```python
import forerad.scrapers.historical as historical

# List the cached trip archives
archives = historical.HistoricalTripArchive.list_cached()

# Load the first archive into a pandas DataFrame
df = archives[0].fetch_df()

print(df)
```

## FAQ
### What's with the stupid name?
I originally wanted to build a forecast of daily trip volume but ended up scaling back my ambitions (maybe just for now). `Fore` is for forecast, `rad` is for das Fahrrad, the German word for bike.
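
### What does an hourly rollup look like in pandas?
If you only want a rough hourly timeseries from a single DataFrame, without provisioning a database, something like the sketch below may be enough. This is not the code behind `./bin/hourly-volume-rollup`. It assumes the DataFrame has a `started_at` timestamp column, which is an assumption about the archive schema (other archives may name it differently), and `hourly_volume` is a hypothetical helper that is not part of forerad. The sample data is made up for illustration.

```python
import pandas as pd


def hourly_volume(df: pd.DataFrame, time_col: str = "started_at") -> pd.Series:
    """Count trips per hour. Hours with no trips get a count of 0.

    `time_col` is an assumed column name; adjust it to match your archive.
    """
    # Parse the start times and use them as the index for resampling.
    starts = pd.DatetimeIndex(pd.to_datetime(df[time_col]))
    # Bucket rows into hourly bins and count the rows in each bin.
    counts = df.set_index(starts).resample("h").size()
    return counts.rename("trip_count")


# Made-up sample trips standing in for a real archive DataFrame.
trips = pd.DataFrame(
    {
        "started_at": [
            "2023-01-01 00:05:00",
            "2023-01-01 00:40:00",
            "2023-01-01 02:15:00",
        ]
    }
)

hourly = hourly_volume(trips)
print(hourly)
```

With a real archive you would pass the DataFrame returned by `fetch_df()` instead of `trips`, and set `time_col` to whatever the start time column is called in that file.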