Now Testing: City street/node updates

:tada:
I’ve got some code that will update CityStrides data from OpenStreetMap. Your edits in OSM will soon make their way into CityStrides!

I’m testing it out on cities one by one, so I can see how long it takes & how many resources it consumes.

This code also solves the issue described in “Some Streets contain Nodes outside of City boundary”.

This code does not (yet) handle updates to the city border itself.

Update: There’s a bug in my post-processing, which caused people to reach thousands of percent complete for some streets. :grimacing: :sweat_smile: The cleanup for this is to completely remove all progress in these cities & rebuild it. Luckily, this only takes a few minutes per city & I’m going one at a time so there’s as short a lag as possible between the progress being wiped & rebuilt.
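For readers wondering how a street can end up far above 100% complete: the post does not say what the bug was, so the following is only a sketch of one plausible mechanism, namely duplicate completion records being counted against a fixed number of street nodes. The function names and the record layout are hypothetical and are not CityStrides code.

```python
def naive_percent(street_node_ids, completion_records):
    """Counts every completion record, so duplicates inflate the result."""
    return 100.0 * len(completion_records) / len(street_node_ids)


def street_percent_complete(street_node_ids, completed_node_ids):
    """Counts each street node at most once, so the result stays in 0..100."""
    street_nodes = set(street_node_ids)
    if not street_nodes:
        # A street with no nodes cannot be progressed; avoid dividing by zero.
        return 0.0
    done = street_nodes & set(completed_node_ids)
    return 100.0 * len(done) / len(street_nodes)


# Hypothetical street with 4 nodes whose completion rows were written 25 times.
street = [1, 2, 3, 4]
records = [1, 2, 3, 4] * 25
print(naive_percent(street, records))            # 2500.0
print(street_percent_complete(street, records))  # 100.0
```

Deduplicating by node ID before dividing keeps the percentage bounded even if extra rows exist, although it would not remove the extra rows themselves.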

Update: OK, I’ve cleaned up the hundreds of millions of bad records. :scream: I’m just about ready to resume tests.

Update 2020-09-14: I’ve got a decent system planned out, but I’m having trouble with excessive IO (from reprocessing users after updating cities) that sometimes causes downtime. I’m slowly testing groups of cities to see how many I can update at once, how much delay I need between each group, etc. This initial test is probably going to be more difficult than regular updates will be, once I get things running continuously - I’m updating cities after a year of OSM changes. :sweat_smile: There should be way less to do once I get this to a point where I’m updating cities weekly (my goal; unsure if I’ll reach it).

Update 2020-09-18: I’ve just started the automatic system & I’ll be monitoring to make sure it doesn’t overload things. It’ll update 10 cities at a time, and it checks every minute to see if it can queue up more cities. It only updates active cities (places with at least one person running there), so that’ll help speed up the frequency of updates. It also tags the ‘last sync’ date, so it’ll re-update in a set order. Right now there are 51,162 cities left in the first update pass (applying all updates in OSM since 2019).
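The queueing rules described in this update (at most 10 cities in progress, only active cities, oldest 'last sync' first) can be sketched roughly as below. This is an illustration under assumptions, not the actual CityStrides implementation: the `City` fields, `next_cities_to_queue`, and `mark_synced` are hypothetical names, and putting never-synced cities first is my own guess.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

BATCH_SIZE = 10  # from the post: 10 cities at a time


@dataclass
class City:
    name: str
    active: bool                          # at least one person running there
    last_sync: Optional[datetime] = None  # None means never synced
    updating: bool = False                # currently being processed


def next_cities_to_queue(cities: List[City], batch_size: int = BATCH_SIZE) -> List[City]:
    """Pick the cities to start now, filling only the free slots."""
    in_flight = sum(1 for c in cities if c.updating)
    free_slots = max(0, batch_size - in_flight)
    candidates = [c for c in cities if c.active and not c.updating]
    # Never-synced cities sort first, then the oldest sync date.
    candidates.sort(key=lambda c: (c.last_sync is not None, c.last_sync or datetime.min))
    return candidates[:free_slots]


def mark_synced(city: City, now: datetime) -> None:
    """Record the sync date so the city moves to the back of the order."""
    city.updating = False
    city.last_sync = now
```

A job that runs once a minute would call `next_cities_to_queue`, set `updating = True` on each result, and start the update for it. Because finished cities get a fresh `last_sync`, repeated passes visit cities in a stable order.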

Update 2020-09-28: 26,068 cities left to update … the first run of this update created some streets with 0 nodes; I think I’ve fixed that, so these 26k cities shouldn’t have that issue. The next pass through of the update code will correct the other cities. I don’t have a good view of throughput yet; I’ve seen numbers from 4 cities per minute up to 16 cities per minute. So I think it might be able to update all cities every 3-9 days. Weekly is nice, I could live with that, but I’m aiming to get it down to every other day - that would be great. :star_struck:
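The throughput-to-days estimate is simple arithmetic, sketched below. The total number of active cities is not stated in the thread, so the figure used here (the 51,162 cities reported as remaining on 2020-09-18) is an assumption standing in for it.

```python
def days_per_full_pass(city_count, cities_per_minute):
    """Days needed to update every city once at a steady rate."""
    minutes = city_count / cities_per_minute
    return minutes / (60 * 24)


# Assumption: use the 51,162 remaining cities from the 2020-09-18 update
# as a stand-in for the total number of active cities.
ASSUMED_CITY_COUNT = 51_162

for rate in (4, 16):
    days = days_per_full_pass(ASSUMED_CITY_COUNT, rate)
    print(f"{rate} cities/min -> {days:.1f} days")
# 4 cities/min -> 8.9 days
# 16 cities/min -> 2.2 days
```

Under that assumption the range comes out at roughly 2 to 9 days, which is in the same ballpark as the 3-9 day estimate above.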

Update 2020-09-29: 25,178 cities left to update … I’ve had a few in this recent batch take double-digit hours to complete.

Update 2020-09-30: 23,589 cities left to update … looks like the rate is about one update per city per month, so far.

Update 2020-10-02: 17,655 cities left to update … I made some improvements that speed things up a bit, but I’m still wildly far off from my every other day goal.

26 Likes

Very cool @JamesChevalier, can you tell us which cities?

2 Likes

NO that’s very secret!
:laughing:
Just kidding

I updated a few in Ohio, working with a fellow Strider, and I updated my hometown. Right now, it’s updating Vancouver.

Toss city links in here and I’ll prioritize you. Ultimately, though, it’s going to be fully automatic.

5 Likes

Awesome, thanks!

1 Like

Going to jump in here and request Greater London: Greater London, England - CityStrides (and the various Boroughs nested therein!) :smiley:

3 Likes

Cambridge MA and Somerville MA please. I’ve made a lot of OSM edits in those. :slight_smile:

3 Likes

Durham NC please! And thank you

1 Like

This is awesome! Excellent work, James. Will this be a one-time sync, or have you figured out a monthly or quarterly sync too?

Greensboro NC Greensboro, North Carolina - CityStrides

1 Like

Yesterday before the update, I was at 99.3%; now I’m at 98.4% for Vancouver, BC. A few new streets have been added, one of which I was running yesterday anyway, so I got credit for it and was the first to complete it.

The one issue is that there are a lot of city squares, some containing buildings or other structures, so you can’t hit all the nodes. As well, some of those I’ve run, but I’ve not been given credit for them. Is this just a timing issue? Will the system reprocess these?

Also, one of our viaducts that I’ve run numerous times is showing up in the middle of the water, nowhere near where it belongs.

Here are two showing in the wrong place:

Showing off the coast of Gabon, Africa?

Ah, no nodes (top left on that page).
:man_shrugging:
I’ll need to research this a bit to figure out how it was included.

Those should be edited in OSM. It sounds like they’re not tagged properly.

Yeah, there’s a whole reprocessing that has to happen for all newly created nodes. This first round of updates is going to create/delete a lot of data, so it’ll take some time to update.

Requesting updates for Fair Lawn, NJ please. Some of the nodes still include non-runnable highways too. Also, there have been new streets added to the town (new construction from the last year or so), so those should be added as well. Please and thank you!!

1 Like

My two targets (in order of preference):

I’d be able to check various OSM-updated areas of Keller. Granbury, not so much, as I don’t really remember my OSM edits, and there weren’t that many.

EDIT: But Granbury is fewer streets.

1 Like

@JamesChevalier Couple of questions on how you plan to implement the new OSM data. I understand you’re still beta testing with some cities before a wider release so the final roadmap/details may still be undecided.

  1. Is the ingest of new OSM data going to be on a schedule? i.e., each day/week/month new changes will be brought in.

  2. Any alerts triggered for users that had previously completed a street but new data indicates they are missing some nodes? Ex: Gus says he’s run every street of Runville, Liechtenstein. He brags about it in every pub he enters. In fact he won’t shut up about it. New OSM data brought in some new nodes & streets. Now Gus is at 99.8% completion rate. Gus wouldn’t know this unless he is still a consistent CS user. Email/fax alert to Gus to let him know to put down his beer and lace up his shoes to get the new nodes/streets? As an aside, no one likes Gus so maybe we don’t tell him ¯_(ツ)_/¯

Sidenote: Probably related to this new ingest/re-processing, streets that I know I have progressed on no longer show a percentage. In addition, the purple bar that was filled based on percentage of completed nodes actually overruns underneath the map by a smidge.

1 Like

There are two cities I’m concerned with.

First, I’m running my home town of Deland, Florida, USA - DeLand, Florida - CityStrides

Second, I’m walking at lunch near my workplace in Daytona Beach, Florida, USA - Daytona Beach, Florida - CityStrides

I’ve made corrections to both in OSM, and both include nodes outside of the city. DeLand will be more fun because there are pockets inside the city that aren’t in the city. This should be a bit more challenging for your code.
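To illustrate why pockets make the boundary check harder: a node has to be inside the outer city border and outside every enclosed pocket. Below is a minimal sketch using the standard ray-casting test, treating coordinates as flat (x, y) pairs. The function names and the sample shapes are hypothetical; how CityStrides actually does this check is not described in the thread.

```python
def point_in_ring(point, ring):
    """Ray-casting test: is the point inside a closed ring of (x, y) vertices?"""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's horizontal line can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def node_in_city(node, outer, holes=()):
    """Inside the outer border and not inside any pocket (hole)."""
    if not point_in_ring(node, outer):
        return False
    return not any(point_in_ring(node, hole) for hole in holes)


# Hypothetical square city with one square pocket in the middle.
city_border = [(0, 0), (10, 0), (10, 10), (0, 10)]
pocket = [(4, 4), (6, 4), (6, 6), (4, 6)]
print(node_in_city((2, 2), city_border, [pocket]))   # True
print(node_in_city((5, 5), city_border, [pocket]))   # False, inside the pocket
print(node_in_city((11, 5), city_border, [pocket]))  # False, outside the city
```

Nodes that fall exactly on a border are an edge case this sketch does not treat specially.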

(I’m still making at least one correction a week to OSM, so you’ll end up updating these cities regularly.)

At any rate, thanks for the update.

One note on what I’ve seen in Vancouver: my latest run seems to have analysed every street I touched as “completed”. Some were completed on other runs and some are very clearly not completed (but the % complete is high, like 2371%). Hard to tell but I’m hopeful this is related to the new data.

Glenn Road and Cooney St in Cambridge MA are other zero-node streets that are mapped off the west coast of Africa. (Looks like 0/0 lat/longitude!) Sedgewick Rd is a street I definitely have a path across, but its only node is showing as incomplete.
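A check for the two symptoms reported here (streets with no nodes, and positions at 0/0 lat/longitude) could look something like the sketch below. The data layout, the function name, and the sample coordinates are all made up for illustration; the thread does not say how street positions are stored or why the empty streets land at 0/0.

```python
def find_suspect_streets(streets):
    """Return (name, reason) pairs for streets that look broken.

    `streets` maps a street name to a list of (lat, lon) node positions.
    """
    suspects = []
    for name, nodes in streets.items():
        if not nodes:
            suspects.append((name, "no nodes"))
        elif any(lat == 0 and lon == 0 for lat, lon in nodes):
            suspects.append((name, "node at 0,0"))
    return suspects


# Illustrative data only; names and coordinates are placeholders.
sample = {
    "Street A": [],
    "Street B": [(42.37, -71.11), (42.38, -71.12)],
    "Street C": [(0, 0)],
}
print(find_suspect_streets(sample))
# [('Street A', 'no nodes'), ('Street C', 'node at 0,0')]
```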

Last piece of info. It’s definitely the case that all my newly “incomplete” streets are at the borders between cities. In some cases they’re streets I’ve actually traversed already, and in some cases it’s the zero-node issue.

(In case it’s helpful to have more instances.)

Thanks for spotting this strange percentage display @Marty & @35757131ebc41e50cb25

I think I found the problematic code. :male_detective: It lines up with a sudden increase in the size of the database (200 GB+ increase within a week, where I’m usually seeing something closer to 10 GB/day). :sweat_smile:
:thinking: I’m actually not sure how this hasn’t already been happening. It might just be the fact that reprocessing the entire city makes it much more noticeable.

I’m working on it & I’ll update when I have more info.

1 Like