Filter out buildings and other non-runnable features

hjkiddk · November 10, 2018, 11:44am

Would be nice if you could see already flagged streets in a city and have the option to either confirm or contest the flag.

supermitch · November 22, 2018, 5:34pm

Similarly, I think that manual flagging should be reviewed. If someone has run a street without manual flagging, it should warn or notify you at least, if you try to manually flag. I know it’s just as easy to fake a GPS segment, but I’m fairly opposed to the concept of manual flagging.

petje · November 26, 2018, 9:14am

I get where you’re going with this. But yesterday, for instance, I tagged several streets as non-accessible, because (parts of) these streets don’t exist anymore! A huge highway is being built at the moment, and lots of streets are simply gone. And in the past some runners did the old roads, of course, when they still existed. So what you are proposing is not always realistic.

And we both know we differ in the CityStrides approach. I just want to run all public streets of a specific city, not some irrelevant non-street data that was uploaded to a city. If the data imported as streets were ‘clean’, this whole discussion wouldn’t be necessary.

zelonewolf · December 8, 2018, 2:32am

Thanks for all the great work on this. The street list is getting better over time. I’d like to offer a slightly different approach. The community approach is important and we need to have it, but there’s still a lot that can be done safely with automated rules.

Now, we know OpenStreetMap is not perfect and can have bad data. But some elements just simply cannot be confused with runnable features and can safely be removed by algorithm.

If we can set a few filtering rules that we can all agree on, it would eliminate 90 percent of the non-runnable features (at least in the cities that I frequent).

Here’s my proposal:

Eliminate any feature with the tag building=yes. There is no possible chance that a feature rendered as a building on OSM should be treated as a runnable feature.
Eliminate any feature with a waterway key, regardless of value. Lakes, rivers, streams, etc., are not runnable and cannot be confused with roads or paths.
Eliminate all leisure=park features. In fact, you might consider eliminating anything with the leisure key with the possible exception of leisure=track. I have found that the leisure=park feature is used very commonly and is rendered as a park boundary area.
Eliminate any feature with a landuse key. These are also rendered as boundaries and will never be runnable features.

By excluding obvious non-runnable features we can reduce effort but still take a conservative approach that won’t inappropriately remove streets and paths.
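
The four rules above can be sketched as a single predicate over a feature's OSM tags. This is only an illustration of the proposal, not CityStrides code: the function name `is_non_runnable` and the plain-dict representation of tags are assumptions made for the example.

```python
def is_non_runnable(tags: dict) -> bool:
    """Return True if the tags match one of the four proposed exclusion rules.

    `tags` is assumed to be a plain dict of OSM key/value strings.
    """
    if tags.get("building") == "yes":   # rule 1: building=yes
        return True
    if "waterway" in tags:              # rule 2: any waterway key, regardless of value
        return True
    if tags.get("leisure") == "park":   # rule 3: leisure=park (leisure=track is kept)
        return True
    if "landuse" in tags:               # rule 4: any landuse key
        return True
    return False


print(is_non_runnable({"building": "yes", "name": "Town Hall"}))
print(is_non_runnable({"highway": "residential", "name": "Main Street"}))
```

Only leisure=park is excluded here; the broader "anything with the leisure key except leisure=track" variant would be a one-line change to rule 3.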

JamesChevalier · January 5, 2019, 7:08pm

I am thinking about extracting my data processing code out of my main codebase, in order to share it around. In the meantime, this Gist contains the code that extracts Street and Node data out of OSM files.

There’s a whole World Of Process™ that occurs around these scripts:

collect .poly files for Cities within Regions within a Country
extract .osm.pbf files out of a Region’s .osm file
convert the .osm.pbf file to .osm file
then run the two scripts linked above to create two .json files (one containing Streets and the other containing Nodes)
then there’s another piece of code to import the data from the .json files into the database

However, the piece that you’re discussing - ignoring certain data types within the Region’s .osm file - is all within the two scripts linked at the top of this post.

…Anyway… Follow the two links at the top of that Gist, because the person who wrote the code (I just modified it a little bit in order to collect Street and Node data) goes into more detail on the process.

zelonewolf · January 5, 2019, 10:38pm

Okay, I see what they did. Basically this script pulls down every way that has a “name” tag.

I modified this a bit to also exclude any ways that have one of a handful of “definitely not runnable” features.

See Code: https://repl.it/@BrianSperlongan/QuickwittedIndolentNonagon

Sorry, not able to test or run it, but I think you get the idea…
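
For readers without access to the linked code, here is a rough standalone sketch of the idea: keep every way that has a “name” tag, unless it also carries a “definitely not runnable” tag. It is not the Gist's code or the repl.it version. It assumes the input is OSM XML small enough to parse in memory, and the names `named_runnable_ways`, `EXCLUDED_KEYS`, `EXCLUDED_TAGS` and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed exclusion lists, following the proposal earlier in this thread.
EXCLUDED_KEYS = {"waterway", "landuse"}                      # any value disqualifies
EXCLUDED_TAGS = {("building", "yes"), ("leisure", "park")}   # exact key/value pairs

SAMPLE = """<osm version="0.6">
  <way id="1"><nd ref="10"/><tag k="highway" v="residential"/><tag k="name" v="Main Street"/></way>
  <way id="2"><tag k="building" v="yes"/><tag k="name" v="Town Hall"/></way>
  <way id="3"><tag k="waterway" v="river"/><tag k="name" v="Mill River"/></way>
  <way id="4"><tag k="highway" v="service"/></way>
</osm>"""


def named_runnable_ways(osm_xml: str) -> list:
    """Return names of ways that have a name tag and no excluded tag."""
    names = []
    for way in ET.fromstring(osm_xml).iter("way"):
        tags = {t.get("k"): t.get("v") for t in way.findall("tag")}
        if "name" not in tags:
            continue                      # the original behaviour: named ways only
        if tags.keys() & EXCLUDED_KEYS:
            continue                      # e.g. waterway=*, landuse=*
        if tags.items() & EXCLUDED_TAGS:
            continue                      # e.g. building=yes, leisure=park
        names.append(tags["name"])
    return names


print(named_runnable_ways(SAMPLE))
```

A real regional extract would be far too large for `ET.fromstring`; a streaming parser would be the obvious substitute there.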

rus.golden · January 9, 2019, 5:09pm

Quick question… is the process to flag the non-runnable features AND mark them manually complete, or just flag them?

I think I mentioned somewhere else, that it would be great if we can flag/complete individual nodes… such as gated communities as one example. The street might be a runnable street, but then end prematurely due to gated community, so don’t want to manually complete the entire street, just the nodes I can’t get to.

zelonewolf · February 7, 2019, 12:45pm

Marking them complete will screw up your percentages. Frankly, I wish the “mark manually complete” option would go away. If it’s in the system and it’s runnable, you should have to run it.

edsheldon · February 7, 2019, 5:20pm

Could something like this be done?

CityStrides users/volunteers could submit a list of all streets, in a predetermined format, for their town, and that list would be used as the standard for completing 100% of the streets in that town. If a user completes all the streets on that list, it would show 100%. If a new street was created due to a new development, a user could submit a new street name to be added to that list and the users would now see a completion of 99%, signaling them that a new street has been created in that town.

Users could continue to run other features, like parks, trails, etc and have them display on their LifeMap but would not count towards % of streets complete.
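
The arithmetic behind this suggestion is simple enough to sketch. The function name `completion_percent`, the use of sets of street names, and the 100-street town are all assumptions for the example; nothing here reflects how CityStrides actually computes percentages.

```python
def completion_percent(town_streets: set, completed: set) -> float:
    """Percentage of the town's agreed street list that the user has completed.

    Anything in `completed` that is not on the town list (parks, trails, ...)
    is ignored, so it can still show on a map without affecting the percentage.
    """
    if not town_streets:
        return 0.0
    return 100 * len(town_streets & completed) / len(town_streets)


town = {f"Street {i}" for i in range(100)}        # hypothetical 100-street list
mine = set(town) | {"Riverside Trail"}            # every street, plus a trail

print(completion_percent(town, mine))             # all listed streets done

town.add("New Development Lane")                  # a new street is submitted
print(round(completion_percent(town, mine)))      # 100 of 101 streets
```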

petje · February 8, 2019, 10:37am

Sounds like the perfect solution to me.

JamesChevalier · February 15, 2019, 12:34am

The ‘manual completed’ idea came from the worry that someone might be an avid runner, but not using a run tracking service … then starting both run tracking & CityStrides … and being able to tick off all the streets they know they’ve run (maybe they have a particular loop they do all the time).
That, or GPS issues during tracking that manage to miss some Nodes on a run, or something.

Thanks for sharing your perspective on this. I’m imagining what things might look like without that feature. I think the GPS issues problem is real (especially the way that I currently determine if a Node is ‘completed’ or not), but I’m not sure if that occurs frequently enough to really matter. I think the ‘new user with an untracked history’ problem is garbage, upon reconsidering it.

I want to see what everyone thinks about this idea, so I added it to the Ideas category.

zelonewolf · March 1, 2019, 2:17am

@JamesChevalier, what are you using as the input for generating the streets list? Are you working from the offline copy of OSM? I’ve been playing with a subsetting tool called osmfilter, and it looks like it’s possible to filter an offline file in-place which could solve the vast majority of the problem with non-runnable features without having to change any code – the input global OSM file could just be pre-filtered before letting citystrides loose on it.

Is that a viable approach?

Also, this could considerably cut down on processing time if your starting OSM file is much smaller.

qaptainqwerty · March 2, 2019, 11:29am

Re ‘new user with untracked history’, I’m one such user, in a way. While I did start out with a Garmin watch, for a few months/years (?) I switched to some app that rewarded my runs with gift cards but is not recognized by CityStrides. I switched completely, i.e. I used only one tracking device at a time. My plan is to re-run whatever streets I know I already ran back then. It may be possible to find the old data and somehow import it into CityStrides, but it’s more enjoyable just re-doing the physical work.

edsheldon · March 14, 2019, 5:13pm

I’ve thought about this also, but I realized before the advent of the GPS device, most runs were out-and-back or loop courses where the distance was determined by measuring with an odometer from a car or bike. Even if all my years of old data were uploaded, I would have to run those routes again just to access all the missing side streets that I didn’t run. So…it is much easier for a new user to start with a clean slate and enjoy the process.

kimluce99 · March 21, 2019, 11:14pm

I would love that option too–I’m down to the last 2% of my city and have quite a few streets that just have a few nodes I can’t get to because they’re in driveways or behind a fence or something. I agree that it would be nice to just be able to mark the node as non-runnable and not the whole street. Regarding the process of flagging them, I think maybe it’s changed, because when I went to flag one yesterday I no longer had the option of explaining why the street wasn’t runnable. There used to be a box for a comment but I’m not seeing that anymore. I’m guessing maybe because James doesn’t have the time to review them anyway? I know at one point he was deleting all of the non-runnable features manually and it was taking forever!

JamesChevalier · March 21, 2019, 11:38pm

Sorry, still super busy post-house-purchase … I’ll try to reply with more thoroughness as soon as I get the chance.

There is a new street flagging system, along with a new “flagged street review system” (for lack of a better term). There is a Flagged Streets link in the menu for subscribers. It’s not available to everyone yet because I’m still working on it (monthly contributors get early access to some stuff).

These changes enable me to make a “street modification system” (again, I need some Marketing Terms). So in the future we will be able to suggest new streets and suggest changes (delete nodes or move them around).

petje · March 22, 2019, 7:05am

Congrats on the house! Just a thought: would it be an idea to use the real source, OpenStreetMap, to improve your DB? When a few nodes are off, or not on a public street, one could enter the changes in OpenStreetMap and you could refresh from that source. That would be a double win: OSM is improved and you wouldn’t have to do too much manually (assuming OSM downloads are easy).

zelonewolf · March 23, 2019, 10:34pm

How about we start by removing all the non-streets from the database?

JamesChevalier · March 24, 2019, 2:57am

I love your enthusiasm.

What you’re stating is very simple from a non-technical perspective. It is quite difficult to do while also not deleting tons of good data.

I’m doing everything I can to give CityStrides users the best, within my limits.

zelonewolf · March 24, 2019, 11:32am

JamesChevalier wrote: “What you’re stating is very simple from a non-technical perspective. It is quite difficult to do while also not deleting tons of good data.”

@JamesChevalier, this idea of accidentally throwing away good data is simply not true. If you were to use the osmfilter utility to strip out all nodes that don’t have a highway= tag, you would eliminate 95%+ of the bad data in the data set. It would also eliminate zero good data. Zero. EVERY path, trail, street, or highway has this tag. I challenge you to come up with a single runnable feature that doesn’t have a highway= tag.
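
As a rough illustration of that rule, here it is in plain Python rather than as an osmfilter invocation: keep a feature only if it has a highway key, whatever the value. The feature dicts, their names, and the function `keep_highway_features` are invented for the example, and the sketch does not test the claim that every runnable feature carries the tag.

```python
def keep_highway_features(features: list) -> list:
    """Keep only features whose tags include a highway key (any value)."""
    return [f for f in features if "highway" in f["tags"]]


# Hypothetical features, each with an OSM-style tag dict.
features = [
    {"name": "Main Street", "tags": {"highway": "residential"}},
    {"name": "Ridge Trail", "tags": {"highway": "path"}},
    {"name": "Mirror Lake", "tags": {"natural": "water"}},
    {"name": "Central Park", "tags": {"leisure": "park"}},
    {"name": "McDonald's", "tags": {"amenity": "fast_food", "building": "yes"}},
]

kept = keep_highway_features(features)
print([f["name"] for f in kept])
```

This is an allowlist rather than a list of exclusions, so the lake, park, and restaurant drop out without any rule naming them. Values such as highway=motorway would still pass, so a value-based rule might be needed on top.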

You are wasting your own time and everyone else’s time trying to do human review on every lake, park, and McDonald’s in the data set.

Also, if you pre-filtered the data set, it would make it smaller and cut down on processing time.

I’d be happy to help with coming up with the right command line switches for osmfilter.