With your help, I think I’ve come up with this new plan which will allow me to keep the Country/Region/City relation structure that already exists (with your global city collection plan, I don’t think I would have knowledge of the Country/Region data for each city). It should
I’ll do this process for every Region in the CityStrides database:
Get all the Cities within the Region
For example, retrieve all Cities within Massachusetts, United States:
Iterate over the elements array, using the ID to collect all of the streets in each city
For example, retrieve all the Streets (with their Nodes) within Holyoke, MA United States:
["highway" !~ "path"]
["highway" !~ "steps"]
["highway" !~ "motorway"]
["highway" !~ "motorway_link"]
["highway" !~ "raceway"]
["highway" !~ "bridleway"]
["highway" !~ "proposed"]
["highway" !~ "construction"]
["highway" !~ "elevator"]
["highway" !~ "bus_guideway"]
["highway" !~ "footway"]
["foot" !~ "no"]
["access" !~ "private"]
["access" !~ "no"];
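To keep that filter list in one place, the query could be assembled in code. This is only a rough Python sketch: the `way(area:...)` prefix, the `["highway"]["name"]` selectors, the `out body;` statement, and the names `build_street_query` and `area_id` are assumptions for illustration, since only the filter lines are shown above.

```python
# Highway values excluded by the filter list above.
EXCLUDED_HIGHWAY_VALUES = [
    "path", "steps", "motorway", "motorway_link", "raceway", "bridleway",
    "proposed", "construction", "elevator", "bus_guideway", "footway",
]


def build_street_query(area_id: int) -> str:
    """Return an Overpass QL query string for the named streets in one area.

    Assumption: area_id is the Overpass area ID of the city collected in
    the previous step. The query prefix and output statement are guesses.
    """
    filters = "".join(
        f'\n  ["highway" !~ "{value}"]' for value in EXCLUDED_HIGHWAY_VALUES
    )
    filters += '\n  ["foot" !~ "no"]'
    filters += '\n  ["access" !~ "private"]'
    filters += '\n  ["access" !~ "no"]'
    return (
        "[out:json];\n"
        f"way(area:{area_id})\n"
        '  ["highway"]["name"]'
        f"{filters};\n"
        "out body;\n"
    )
```

Adding or dropping an excluded highway type is then a one-line change to the list.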
Iterate over all of the "way" items in the elements array to create the Street/Node records
I think for now I’ll stick with my current data structure (condensing down all of OSM’s ‘segments’ into single Street records within CityStrides), then I’ll find/create the Street by the name value for each item, and create each Node in the nodes array (storing its index in the array as an osm_order field of some sort).
I may opt to expand that osm_order field by including both the way ID and the node index, e.g. 8766660-2. This would allow me to draw Streets on the map as lines.
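The find/create step could look roughly like this in Python. It is a sketch, assuming the Overpass JSON response has an `elements` list whose "way" items carry `id`, `nodes`, and `tags.name`. The function name, the `osm_node_id` field, and the sample node IDs are made up for illustration, and the real version would write database records rather than build a dict.

```python
from collections import defaultdict


def group_ways_into_streets(elements):
    """Collapse OSM way segments that share a name into one street entry.

    Each node gets an osm_order of "<way id>-<index in that way's nodes array>".
    """
    streets = defaultdict(list)
    for element in elements:
        if element.get("type") != "way":
            continue  # skip nodes/relations in the same response
        name = element.get("tags", {}).get("name")
        if not name:
            continue  # unnamed ways cannot be matched to a Street by name
        for index, node_id in enumerate(element.get("nodes", [])):
            streets[name].append(
                {"osm_node_id": node_id, "osm_order": f"{element['id']}-{index}"}
            )
    return dict(streets)


# Illustrative input only; the node IDs are invented.
sample_elements = [
    {"type": "way", "id": 8766660, "nodes": [11, 12, 13],
     "tags": {"name": "Main Street", "highway": "residential"}},
    {"type": "way", "id": 8766661, "nodes": [13, 14],
     "tags": {"name": "Main Street", "highway": "residential"}},
]
streets = group_ways_into_streets(sample_elements)
```

Keeping the way ID in `osm_order` means the nodes of each original segment can be put back in sequence later, which is what drawing a Street as lines needs.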
I do expect that I’ll need to stand up my own Overpass server to do all this work against, so that I can avoid rate limiting. I am worried about the build time there - based on the docs, it looks like it could take 48 hours to start up an Overpass server.
I think that I can do the whole process in parallel across Regions. Maybe have 10-20 processes running at once, each working on their own Region.
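A minimal sketch of that fan-out in Python, assuming the per-Region work is mostly waiting on Overpass responses (so threads are enough). `process_region` is a hypothetical placeholder for the city/street/node collection described above, and the worker count is just the low end of the 10-20 range.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_region(region_id):
    """Hypothetical placeholder for the real per-Region import work."""
    return "done"


def process_all_regions(region_ids, max_workers=10):
    """Run process_region for each Region, at most max_workers at a time."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(process_region, rid): rid for rid in region_ids}
        for future in as_completed(futures):
            region_id = futures[future]
            try:
                results[region_id] = future.result()
            except Exception as exc:
                # Record the failure so one bad Region does not stop the run.
                results[region_id] = f"failed: {exc}"
    return results
```

If the record-building turns out to be CPU-heavy rather than network-bound, separate processes may be a better fit than threads.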
Update: I’ve just realized that I need a geojson copy of the city border, in order to display that on the map. It looks like I can use out geom; to include the lat/lon values that comprise each City’s border. Then I’d have to build the geojson by hand…
This example for all the cities in Massachusetts
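For the "build the geojson by hand" part, here is a rough Python sketch. It assumes each City comes back as a relation whose `members` include ways with `role` "outer" and a `geometry` list of `lat`/`lon` points, which is my understanding of what `out geom;` returns. The function names are made up, inner rings (holes) are ignored, and ring winding order is not normalized.

```python
def stitch_rings(segments):
    """Join way segments end-to-end into closed rings of (lon, lat) points."""
    remaining = [list(seg) for seg in segments if len(seg) >= 2]
    rings = []
    while remaining:
        ring = remaining.pop(0)
        while ring[0] != ring[-1]:
            for i, seg in enumerate(remaining):
                if seg[0] == ring[-1]:
                    ring.extend(seg[1:])
                elif seg[-1] == ring[-1]:
                    ring.extend(reversed(seg[:-1]))  # segment runs backwards
                else:
                    continue
                remaining.pop(i)
                break
            else:
                break  # nothing continues this ring; give up on it
        if ring[0] == ring[-1] and len(ring) >= 4:
            rings.append(ring)
    return rings


def city_border_feature(relation):
    """Build a GeoJSON Feature (MultiPolygon) from one city relation."""
    outer_segments = [
        [(point["lon"], point["lat"]) for point in member["geometry"]]
        for member in relation.get("members", [])
        if member.get("type") == "way"
        and member.get("role") == "outer"
        and member.get("geometry")
    ]
    rings = stitch_rings(outer_segments)
    return {
        "type": "Feature",
        "properties": {
            "osm_id": relation["id"],
            "name": relation.get("tags", {}).get("name"),
        },
        # GeoJSON positions are [lon, lat]; one polygon per outer ring.
        "geometry": {"type": "MultiPolygon", "coordinates": [[r] for r in rings]},
    }
```

The result can be passed to `json.dumps` as-is, since tuples serialize as JSON arrays.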
The next hurdle is how to continue doing this (monthly/quarterly/yearly?) to keep everything up to date. I think that depends on how long the process above takes…
My thought at the moment is to have this whole process writing to a brand new database, which I can swap out in the live site. This would also mean that I’d need to reprocess all activities against the new database - so I might also extract that out to its own database that also gets swapped out in the live site.
- New database for Country/Region/City/Street/Node data & new database for completion data
- Run the above process to completely populate Country/Region/City/Street/Node data
- Reprocess activities into the new completion database
- Swap out both databases from old -> new for the website