# Citibike topography methodology

This document outlines the methodology and data sources used in _[Visualizing the topography of Citibike](https://www.stevegattuso.me/2021/11/28/citibike-topography.html)_.

## Data Sources
* Used [this](https://gist.github.com/stevenleeg/c9815da685ea0736f77557032b222d48) Python script to download all Citibike stations.
* Used [NYC Neighborhood Tabulation Area](https://www1.nyc.gov/site/planning/data-maps/open-data/census-download-metadata.page?tab=2) (NTA) geography files.
* Used [MapPLUTO](https://www1.nyc.gov/site/planning/data-maps/open-data/dwn-pluto-mappluto.page) data for household calculations.
* Fetched 2015-2019 ACS data and census tract geographies from the National Historical Geographic Information System (NHGIS) [data portal](https://www.nhgis.org/).

## Steps
1. Areas within 0.5km of a Citibike station
	* Imported Citibike station CSV from Python script
	* Reprojected to the New York/Long Island CRS
	* Created a buffer of 0.5km around each station
	* Dissolved all buffers into a single polygon
	* Clipped the polygon using the NTA polygons
2. Households served (within the 0.5km range)
	* Imported MapPLUTO data
	* Clipped using the 0.5km station buffers
	* Ran the `Basic statistics` operation on:
		* Unclipped MapPLUTO data to get the total number of households
		* Clipped MapPLUTO data to get the number of households within 0.5km of a station
	* Calculated percentages based on these values
3. Neighborhood station capacity
	* Imported Citibike station CSV from Python script
	* Ran `Join attributes by location (summary)` operation
		* Summed up `capacity` column of each station per neighborhood
	* Created a new column: `capacity_count / $area * 100` to generate `capacity_per_100sqkm`
	* Visualized the column on the NTA map
4. Neighborhood station capacity in NTAs below the poverty line
	* Fetched and imported census tract geographies sourced from NHGIS
	* Fetched and joined NHGIS 2015-2019 ACS median household income data
	* Used NTA geography files
	* Generated centroids of each polygon
	* Ran `Join attributes by location (summary)` operation to merge ACS data into NTA polygons
		* Used the median of the median income field
	* Kept only NTAs whose median income falls below the poverty line of $35k
	* Ran `Join attributes by location (summary)` to merge station data with NTA polygons
		* Summed up `capacity` column
	* Created a new column: `capacity_count / $area * 100` to generate `capacity_per_100sqkm`
	* Visualized the column on the map along with station locations
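
The attribute arithmetic in steps 2 through 4 can be sketched in plain Python. This is a minimal illustration rather than the workflow actually used, which ran inside a GIS tool. The function names, the sample values, and the assumption that areas are already in square kilometres are all hypothetical; the document does not state the units of `$area`.

```python
from statistics import median

# Threshold quoted in step 4 of the methodology.
POVERTY_LINE_USD = 35_000


def household_share(served: float, total: float) -> float:
    """Percentage of households within 0.5km of a station (step 2)."""
    return 100.0 * served / total


def capacity_per_100sqkm(capacity_sum: float, area_sqkm: float) -> float:
    """Summed station capacity normalized to 100 sq km (steps 3 and 4).

    Assumes area_sqkm is in square kilometres.
    """
    return capacity_sum / area_sqkm * 100


def low_income_ntas(tract_incomes: dict[str, list[float]]) -> dict[str, float]:
    """Take the median of tract-level median incomes per NTA,
    then keep only the NTAs that fall below the threshold (step 4)."""
    medians = {nta: median(vals) for nta, vals in tract_incomes.items() if vals}
    return {nta: m for nta, m in medians.items() if m < POVERTY_LINE_USD}


# Hypothetical sample values, for illustration only.
tracts = {"NTA-A": [28_000, 31_000, 40_000], "NTA-B": [52_000, 61_000]}
selected = low_income_ntas(tracts)
density = capacity_per_100sqkm(capacity_sum=120, area_sqkm=2.5)
share = household_share(served=250, total=1_000)
```

With these sample inputs only `NTA-A` passes the income filter, because its median of medians (31,000) is below the threshold while `NTA-B`'s (56,500) is not.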