Identifying outliers reliably

by multimob — written on 2023-11-21

The challenges for QA on public transport networks are countless. Here is one that is fairly obvious to the human eye but fairly complicated to check in an automated manner.

A bus route in OSM with a stop in a street where the bus does not run

There is clearly a problem here: either this stop should not have been added, or the itinerary is incorrect.

Trying to match OSM data with GTFS data for the expected sequence of stops can help us figure out whether this is the first or the second situation.

But sadly, this method alone is insufficient to identify such a problem. The only method I can see would be to download all the nodes of the ways in the relation and, for each stop, calculate the distance to the nearest node. There might be a few false positives now and then (imagine a long, straight road mapped with only two nodes and a bus stop near the middle: the stop is very close to the road but far from the only available nodes), but that could be manageable.
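Here is a minimal sketch of that distance check in Python, assuming the nodes of the route's ways and the stop positions have already been downloaded as (lat, lon) pairs. It measures the distance to each segment between two consecutive nodes rather than to the nodes themselves, which is a variation on the method described above and should sidestep the long-straight-road case. The function names, the flat-plane approximation and the 50 m threshold are illustrative assumptions, not part of any existing tool.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, good enough at street scale


def to_local_xy(lat, lon, ref_lat, ref_lon):
    """Project (lat, lon) to metres on a flat plane centred on a reference point.

    Equirectangular approximation: only valid over short distances.
    """
    x = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_M
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return x, y


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b, all given as (x, y) in metres."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # a and b are the same node
        return math.hypot(px - ax, py - ay)
    # Position of the projection of p along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_route(stop, ways):
    """Shortest distance in metres from a stop to any segment of the route.

    stop: (lat, lon); ways: list of ways, each an ordered list of (lat, lon) nodes.
    """
    ref_lat, ref_lon = stop
    best = math.inf
    for way in ways:
        pts = [to_local_xy(lat, lon, ref_lat, ref_lon) for lat, lon in way]
        for a, b in zip(pts, pts[1:]):
            # The stop is the origin of the local plane.
            best = min(best, point_segment_distance((0.0, 0.0), a, b))
    return best


def find_outlier_stops(stops, ways, threshold_m=50.0):
    """Return the names of stops farther than threshold_m from the route."""
    return [name for name, pos in stops.items()
            if distance_to_route(pos, ways) > threshold_m]
```

With a single two-node way running from (50.0, 4.0) to (50.0, 4.02) and a stop at (50.0001, 4.01), the stop is more than 700 m from either node but only about 11 m from the segment, so it is not flagged. A stop at (50.005, 4.01), roughly 550 m off the road, is. The threshold would need tuning against real data, for example to allow for stops mapped on wide avenues or at bus stations.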



Screenshots with maps are © OpenStreetMap contributors