"Update this street" feature

I mentioned it somewhere else too, but I’m hoping for an ‘Update this street’ feature. In your query you could use the city area ID and ['name' = '{street_name}'] to get the current nodes from OpenStreetMap.
Then, of course, the results have to be processed. If there are fewer nodes (which I think will be the majority of cases), this should be straightforward (said the person not having to do it :wink: ).
If there are more nodes it will be harder, because the activities that need to be recalculated have to be identified, and I’m not sure how to do that.
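A rough sketch of such a query in Overpass QL (the area ID and street name below are placeholders, not real values):

```
[out:json][timeout:60];
area(3600000000)->.city;                // placeholder: 3600000000 + the city's relation ID
way(area.city)["highway"]["name"="Example Street"];
(._; >;);                               // recurse down from the ways to their nodes
out body;
```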

@JamesChevalier Do you think such a feature is at all possible?

I haven’t put too much thought into how the global update process is going to go. I’ve had my mind at the global level, but this street-level approach is interesting.

I’ll need to look at the data that comes back from that kind of query and see how easy it is to compare the returned Nodes with the Nodes that I have in my database. I do store the OSM ID for each Node, so that could help. I know that OSM IDs aren’t static, but I don’t know whether updates can change the IDs or whether it’s just that delete/recreate produces a new ID.
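Assuming each side can be reduced to a plain set of OSM node IDs, the comparison itself is just set arithmetic. A minimal sketch (the IDs here are made up for illustration):

```python
# Compare node IDs returned by Overpass with those stored locally.
def diff_node_ids(osm_ids, db_ids):
    added = osm_ids - db_ids    # present in OSM, missing locally
    removed = db_ids - osm_ids  # stored locally, no longer returned by OSM
    return added, removed

osm = {101, 102, 103, 104}   # IDs from the Overpass query (hypothetical)
db = {101, 102, 105}         # IDs stored in the database (hypothetical)
added, removed = diff_node_ids(osm, db)
print(sorted(added), sorted(removed))  # [103, 104] [105]
```

Nodes whose IDs match on both sides could then be checked for coordinate changes.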

I would strongly recommend an overpass diff query. It will give you all of the way and node deltas between two dates and it’s a relatively light query for a city sized area.

From the Overpass QL docs, it looks like I’d use [diff:"2019-09-01T01:00:00Z"] (or whenever I started this project) in my query. I’m having a tough time figuring out where it goes in the query, though.

The closest I’ve come is adding it to the first line of my street query: [timeout:900][out:json][diff:"2019-09-01T01:00:00Z"]; which produced the error “The selected output format does not support the diff or adiff mode.”

It sounds like I’ll need to change more of the query to work with diffs, but I don’t have time to dive into that right now.

You have the syntax correct. Just use the same query you used to generate the original and add the diff clause to the front.

The only issue is that diff ONLY works for XML output, NOT JSON. So you’ll have to translate it, but it will give you the information you need.
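Putting those two points together, the street query would keep its existing shape but switch to XML output, with the diff setting in the settings line. A sketch (placeholder area ID and street filter):

```
[timeout:900][out:xml][diff:"2019-09-01T01:00:00Z"];
area(3600000000)->.city;                // placeholder city area ID
way(area.city)["highway"]["name"];
(._; >;);
out meta;
```

With a single date, the diff is computed between that date and now; a second date can be given to diff between two fixed points in time.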


Oops!

runtime error: Tried to use museum file but no museum files available on this instance.
runtime error: open64: 2 No such file or directory /location/of/my/osm-3s_v0.7.55/db/ways_attic.bin File_Blocks::File_Blocks::1

Looks like I didn’t build my Overpass server to include attic data. :man_facepalming:

After running my query against //overpass-api.de/api/ I can see how that approach can be way more efficient. I’ll still need to try a few queries out to see what kind of action sets there are. My query only returned a single node addition to a single street. I worry about how complicated these data sets are going to be. :grimacing:
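For getting a feel for those action sets: the diff response wraps each change in an action element, where creations contain the new element directly and modifications wrap old/new pairs. A sketch of tallying them with the standard library (the XML fragment below is made up, but follows that shape):

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of an Overpass diff response: one created node
# and one modified node (modify actions wrap <old>/<new> pairs).
sample = """<osm>
  <action type="create">
    <node id="201" lat="40.0" lon="-74.0"/>
  </action>
  <action type="modify">
    <old><node id="101" lat="40.1" lon="-74.1"/></old>
    <new><node id="101" lat="40.2" lon="-74.1"/></new>
  </action>
</osm>"""

def summarize_actions(xml_text):
    counts = {"create": 0, "modify": 0, "delete": 0}
    for action in ET.fromstring(xml_text).findall("action"):
        counts[action.get("type")] += 1
    return counts

print(summarize_actions(sample))  # {'create': 1, 'modify': 1, 'delete': 0}
```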


Regarding the global update process, I would consider making this a continuous, gradual process rather than a big bang. Each day, refresh a few cities (depending on resources); the next day, move down the list of cities, and so on. You could use this to prioritize certain cities based on activities, number of runs, etc. After all, cities without any runs don’t really need to be updated.
This will also give you a chance to fine-tune the process, and if any issues come up there will be less to fix and fewer users affected.
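The daily-rotation idea could be as simple as ranking cities by a score and taking the top few each day. A sketch under assumed inputs (the city names, fields, and weighting are all hypothetical):

```python
# Each day, refresh the per_day cities with the highest priority score.
def pick_cities_for_today(cities, per_day=3):
    # cities: list of (name, recent_runs, days_since_refresh)
    def score(city):
        _, runs, staleness = city
        return runs * staleness   # active *and* stale cities rank first
    ranked = sorted(cities, key=score, reverse=True)
    return [name for name, _, _ in ranked[:per_day]]

cities = [("A", 50, 10), ("B", 0, 400), ("C", 20, 30), ("D", 5, 5)]
print(pick_cities_for_today(cities, per_day=2))  # ['C', 'A']
```

Note that a city with zero runs scores zero regardless of staleness, which matches the idea that cities without any activity don’t need updating.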

Yeah - an ongoing effort that pairs A) the most active cities in CityStrides with B) the most actively updated cities in OSM probably makes the most sense.

One concern I have is that bringing this closer to daily (from monthly/quarterly) starts to shift a bunch of my ongoing focus from development into infrastructure - unless I can get it right the first time.

  • I have to build the correct Overpass queries, and handle the XML response properly
  • I have to rebuild my Overpass server to include attic data, so that I can actually run these queries
  • I have to build out an infrastructure that can handle whatever load is required to do this work
  • I have to set up the system that automates this effort

My current perspective is that the Overpass query & response handling will be the biggest effort. The OSM & CityStrides data aren’t 1:1 matches, so I’ll need to figure out how to handle the changes in a way that makes sense within CityStrides.
I don’t think there’s a ton of effort in the Overpass rebuild or the infrastructure setup, but there are definitely some costs involved that I’ll need to keep an eye on.
The automation work seems like it’ll be the second biggest effort, mainly around balancing good update progress against server load.

I don’t pretend to know what is best from a programming infrastructure standpoint, but I agree with the comment above that 2-4 times a year is plenty often enough for overall updates.

Given the volume of changes that likely take place in OSM data, I don’t expect every modification to need to trigger an update.