Hunting for duplicate primary keys

by multimob — written on 2024-04-29


Adding route relations to OSM can be easy when all your stops are on the map.

We wrote a simple script that merely calculates the sequence of stops—from GTFS data—and will create a new relation containing all the stops in the correct order. Pick the right OSM object with the corresponding code and you have it. On paper, everything is extremely simple.

The problem is when OSM data contains mistakes. Here is a common mistake: a user creates a bus stop and adds a ref:De_Lijn code to it. This tag contains the unique identifier of the stop. This figure can be obtained from the stop sign or the operator’s website. Then another mapper comes in and adds a new bus stop; if the mapper is not familiar with OSM they might want to duplicate the first node for the sake of convenience. After that, there will be two nodes with the same ref:De_Lijn code. The generator will probably pick the wrong one.

Here is how we fix it. First, run the following Overpass query:

[out:xml][timeout:300][bbox:{{bbox}}];
(
  nw[public_transport=platform][!"train=yes"];
);
(._;>;);
out meta;

Save the file, then run this simple query to easily identify multiple occurrences of the ref:De_Lijn tag. Notice the apostrophe in the grep statement, this is to avoid selecting rows containing route_ref:De_Lijn.

grep "'ref:De_Lijn" tmpfile.osm | sort | uniq -c | sort -n

Your output will look like this:

      1     <tag k='ref:De_Lijn' v='535951' />
      1     <tag k='ref:De_Lijn' v='535952' />
      1     <tag k='ref:De_Lijn' v='535953' />
      1     <tag k='ref:De_Lijn' v='538629' />
      1     <tag k='ref:De_Lijn' v='557822' />
      2     <tag k='ref:De_Lijn' v='204437' />
      2     <tag k='ref:De_Lijn' v='205437' />
      2     <tag k='ref:De_Lijn' v='205439' />
      2     <tag k='ref:De_Lijn' v='211082' />

Inspect OSM data for every stop code where the count is higher than 1 and try to figure out which stop has the correct ref number and which stop must be changed.

This is a task we try to do at least once every month, because new collisions regularly occur, when people add new stops or copy existing data.


Permalink: https://blog.multimob.be/zztkn58use.htm

Back to the index

Screenshots with maps are © OpenStreetMap contributors