Validating on two versions of the same route

by multimob — written on 2024-04-15


This is a follow-up to a previous post about problems with GTFS data showing multiple versions of the same route.

A handy script we added downloads the tags of a given relation and compares them to the GTFS database. If we have a match on the route number, it is fine. If multiple routes have the same number (a very common occurrence for De Lijn and TEC, which have several hundred routes), we can discriminate and match on the sub-network, then on the colour, and then on the destination text.
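
The matching cascade described above could be sketched roughly as follows. This is not the actual script: the function names, the GTFS-side field names and the choice of the OSM `to` tag for the destination text are all assumptions made for illustration.

```python
def normalise(value):
    """Make values comparable: drop surrounding spaces, a leading '#', and case."""
    return str(value or "").strip().lstrip("#").casefold()


# (OSM tag, GTFS-side field) pairs, in the order the text describes.
# The GTFS-side field names are assumptions for this sketch.
CRITERIA = [
    ("ref", "route_short_name"),  # route number
    ("network", "network"),       # sub-network, e.g. TECN
    ("colour", "route_color"),
    ("to", "headsign"),           # destination text
]


def narrow_candidates(osm_tags, gtfs_routes):
    """Filter GTFS candidates one criterion at a time; stop once at most one is left."""
    candidates = list(gtfs_routes)
    for osm_key, gtfs_key in CRITERIA:
        wanted = osm_tags.get(osm_key)
        if wanted is None:
            continue  # tag absent in OSM, so it cannot discriminate
        candidates = [r for r in candidates
                      if normalise(r.get(gtfs_key)) == normalise(wanted)]
        if len(candidates) <= 1:
            break
    return candidates
```

An empty result means no match at all. A result with more than one entry means the tags alone cannot tell the candidates apart.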

But we might still have a problem when one route in OSM matches multiple concurrent versions of the same route in GTFS.

A further check might be to inspect the calendar. For every (route/stop) combination we can obtain the days of operation. Querying the database for the minimum and maximum of those dates will tell us which version of the route is the most recent one. We can then match that one and consider it to be the preferred version.
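
As a minimal sketch of that min/max query, assume the feed has been loaded into SQLite with the standard GTFS tables `routes`, `trips` and `calendar`, and that each version of a route has its own `route_id`. Both points are assumptions: a feed that only publishes `calendar_dates` would need a different join. The sketch also works per route rather than per (route/stop) combination, to keep it short.

```python
import sqlite3

# GTFS dates are YYYYMMDD strings, so MIN/MAX on the text sorts chronologically.
SPAN_QUERY = """
SELECT r.route_id,
       MIN(c.start_date) AS first_day,
       MAX(c.end_date)   AS last_day
FROM routes r
JOIN trips t    ON t.route_id = r.route_id
JOIN calendar c ON c.service_id = t.service_id
WHERE r.route_short_name = ?
GROUP BY r.route_id
ORDER BY last_day DESC, first_day DESC
"""


def version_spans(conn, short_name):
    """Return (route_id, first_day, last_day) per version, most recent first."""
    return conn.execute(SPAN_QUERY, (short_name,)).fetchall()
```

The first row returned would be the candidate for the preferred version. Whether "most recent" should mean the latest end date or the latest start date is a design choice; here the end date decides and the start date breaks ties.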

This has to be tested on real data. We have just observed that TEC has released an updated version of a route for an extension. But then we have a new problem.

One version has the following tags, which match OSM data.

description=Namur- Rhisnes - Les Isnes
name=Bus 721
network=TECN
operator=TEC
ref=721

The other version has the following tags. The route will be extended to Velaine.

description=Namur- Rhisnes - Les Isnes - Velaine
name=Bus 721
network=TECN
operator=TEC
ref=721

The script won't work well here. It will normally assume that the first version is the one and only correct one, as it produces a perfect match with OSM data. The script is therefore unable to detect changes in itineraries, and will probably reject the second version or treat it as unmatched garbage. The network contains several hundred routes and OSM data for TEC is still very far from reality, so one more unmatched entry will probably be ignored.

The change will only be detected automatically several weeks later. Once the extension is in operation, the first version of the line (ending at Les Isnes) will disappear from GTFS, leaving only the second version. Running the script from that day on will report a mismatch on the description tag, which will raise an alert. By then it will probably be too late.

Getting things done right and proactively will need some extra checks here.
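
One possible extra check, offered only as a sketch and not as what the script does: look at the versions whose service has not started yet and flag those whose tags differ from OSM. The function name, the `first_day` and `tags` fields and the default of comparing only `description` are hypothetical.

```python
from datetime import date


def upcoming_changes(osm_tags, gtfs_versions, today, keys=("description",)):
    """List GTFS versions that start after `today` and differ from OSM on `keys`."""
    alerts = []
    for version in gtfs_versions:
        if version["first_day"] <= today:
            continue  # already in service: the regular matching handles it
        changed = {k: version["tags"].get(k) for k in keys
                   if version["tags"].get(k) != osm_tags.get(k)}
        if changed:
            alerts.append((version["first_day"], changed))
    # Earliest upcoming change first.
    return sorted(alerts, key=lambda alert: alert[0])
```

For route 721, the version ending at Velaine would be reported as soon as it appears in GTFS with a future start date, instead of on the day the old version disappears.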


Permalink: https://blog.multimob.be/zzeeth8uop.htm


Screenshots with maps are © OpenStreetMap contributors