About persistent route id codes in GTFS

by multimob — written on 2024-03-18


Static GTFS data on one side. OSM data on the other side. How do we reconcile them?

Easy solution: if they have the same route number, it must be the same route. Well, maybe, but there are currently 1,066 routes in the GTFS for De Lijn alone. De Lijn operates 16 bus routes named "Bus 1", 14 routes named "Bus 2", 7 routes named "Bus 20" and so on. We need more criteria to match them.
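To see how ambiguous route numbers are in a feed, we can count how many routes share each short name. This is a minimal sketch, assuming the standard GTFS routes.txt columns (route_id, route_short_name, route_long_name). The sample rows are made up for illustration and are not real De Lijn data.

```python
import csv
import io
from collections import Counter


def count_short_names(routes_file):
    """Count how many routes share each route_short_name in a routes.txt file."""
    reader = csv.DictReader(routes_file)
    return Counter(row["route_short_name"] for row in reader)


# Made-up sample rows standing in for a real routes.txt.
SAMPLE_ROUTES = """route_id,route_short_name,route_long_name
100,1,Town A - Station
101,1,Town B - Harbour
102,2,Town A - Hospital
"""

counts = count_short_names(io.StringIO(SAMPLE_ROUTES))

# Short names used by more than one route cannot be matched on number alone.
ambiguous = {name: n for name, n in counts.items() if n > 1}
```

With a real feed, the same function can be called on open("routes.txt", newline="", encoding="utf-8").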

We started writing a small script that extracts a few attributes of each route. For De Lijn, we can identify the province where a bus route has the most stops, plus some elements such as the route colour or the destination. Those can be set in OSM tags.
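The "province with the most stops" criterion could be sketched as below. It assumes that each stop of the route has already been assigned a province name by some earlier step (GTFS itself does not provide one), so the input list and the function name are hypothetical.

```python
from collections import Counter


def dominant_province(stop_provinces):
    """Return the province holding the most stops of a route, or None.

    stop_provinces is a list with one province name per stop of the route
    (hypothetical input; stops without a known province are None).
    """
    counts = Counter(p for p in stop_provinces if p)
    if not counts:
        return None
    # most_common(1) returns a list with the single (province, count) pair.
    return counts.most_common(1)[0][0]
```

A route that crosses a provincial border is then attributed to the province where most of its stops are.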

The script will then populate a table. We can see that route 76466 in GTFS is in fact "Bus 2" in Oostende, which is relation 3349127 in OSM.

Oh yes, on top of that, some routes appear 2 or 3 times in the same GTFS feed. The reason: a GTFS feed covers a period of several days, usually 2-3 weeks. If there is a small service change during that period (a new stop appears, the frequency of service changes), they will often duplicate the route to have several versions of it. We should expect all versions to point to the same route relation in OSM.
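One way to check this expectation is to invert the table and list, for each OSM relation, every GTFS route id that points to it. This is only a sketch: the table is assumed to be a plain dictionary, and apart from route 76466 and relation 3349127 the ids below are made up.

```python
from collections import defaultdict


def routes_per_relation(matches):
    """Invert {gtfs_route_id: osm_relation_id} into {osm_relation_id: [route ids]}."""
    grouped = defaultdict(list)
    for route_id, relation_id in matches.items():
        grouped[relation_id].append(route_id)
    return {relation_id: sorted(ids) for relation_id, ids in grouped.items()}


# 76466 -> 3349127 comes from the post; the other entries are invented.
matches = {"76466": 3349127, "76467": 3349127, "80001": 1111111}

grouped = routes_per_relation(matches)

# Relations reached by several GTFS route ids: expected for duplicated versions.
duplicated = {rel: ids for rel, ids in grouped.items() if len(ids) > 1}
```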

Anyway, the script made us happy. It also gave a score indicating how confident it was in each match. For instance, if colours are missing in OSM data, we cannot be sure that we have the correct route, unless there is only one possible match.
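The scoring idea could look something like the sketch below. It is not the actual script: the dictionary keys (ref, province, colour, destination), the score formula and the function names are all assumptions made for illustration. A missing OSM value neither confirms nor contradicts a match, and a single remaining candidate is treated as certain.

```python
CRITERIA = ("province", "colour", "destination")


def score_candidate(gtfs_route, osm_relation):
    """Return a score between 0 and 1; 0 means the candidate is ruled out."""
    if gtfs_route["ref"] != osm_relation.get("ref"):
        return 0.0
    agreeing = 0
    for key in CRITERIA:
        osm_value = osm_relation.get(key)
        if osm_value is None:
            continue  # missing in OSM: adds uncertainty, not a contradiction
        if osm_value != gtfs_route[key]:
            return 0.0  # explicit contradiction
        agreeing += 1
    # The matching ref counts as one agreeing criterion.
    return (1 + agreeing) / (1 + len(CRITERIA))


def best_match(gtfs_route, osm_relations):
    """Return (relation id, score), or (None, 0.0) if nothing is possible."""
    scored = [(score_candidate(gtfs_route, rel), rel) for rel in osm_relations]
    possible = [(score, rel) for score, rel in scored if score > 0]
    if not possible:
        return None, 0.0  # hint that the route may be missing in OSM
    if len(possible) == 1:
        return possible[0][1]["id"], 1.0  # one single possible match
    score, relation = max(possible, key=lambda pair: pair[0])
    return relation["id"], score
```

When several candidates remain, the score stays below 1 as long as some OSM values are missing, which is the uncertainty described above.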

That was the idea: run the script against every existing route master relation in OSM and see if it matches. And if it cannot match anything, we have a hint that OSM data may be missing.

What could possibly go wrong?

This could work… unless… the GTFS data changes. And the GTFS data does change every day.

For reasons beyond our understanding, De Lijn does not keep a persistent route id in its GTFS file. They regenerate new codes regularly, even when the route (apparently) did not change at all.

As a consequence, our table is broken every time we import a new GTFS feed.

Matching with OSM data is long and painful: it requires calling the database all the time, and we’d rather avoid doing that.
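One direction we could explore, without any guarantee that it works for De Lijn, is to key the table on a fingerprint built from descriptive attributes instead of the route id. The sketch below assumes that the short name, long name and province of a route stay the same when the ids are regenerated, which we have not verified. The function name and the choice of fields are hypothetical.

```python
import hashlib


def route_fingerprint(short_name, long_name, province):
    """Build a short key from attributes assumed to survive id regeneration."""
    parts = [value.strip().lower() for value in (short_name, long_name, province)]
    # The unit separator avoids collisions such as ("1", "2 x") vs ("1 2", "x").
    raw = "\x1f".join(parts)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()[:12]
```

If the fingerprint of a route in a new feed is already in the table, the stored OSM relation could be reused without querying the database again. Duplicated versions of a route would share a fingerprint, which fits the expectation that they point to the same relation.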

The road is long…


Permalink: https://blog.multimob.be/zzdequie7p.htm


Screenshots with maps are © OpenStreetMap contributors