SparkAndShine

Reputation: 18047

How do I group stops into a parent station in GTFS?

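GTFS (the General Transit Feed Specification, which defines public transportation schedules and the associated geographic information) lets a station group several stops: each stop's parent_station field holds the stop_id of the station it belongs to.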

I am analyzing the Paris GTFS data, and none of the parent_station fields is filled in:

mysql> SELECT DISTINCT parent_station FROM stops;
+----------------+
| parent_station |
+----------------+
|                |
| 0              |
+----------------+

How do I specify parent stations for stops (that is, group stops into a parent station)?

mysql> SELECT * FROM stops LIMIT 10;
+---------+-----------+------------------------------------+-------------------------------------------+-----------+----------+---------------+----------------+
| stop_id | stop_code | stop_name                          | stop_desc                                 | stop_lat  | stop_lon | location_type | parent_station |
+---------+-----------+------------------------------------+-------------------------------------------+-----------+----------+---------------+----------------+
| 1166824 |           | "Olympiades"                       | "91 rue de Tolbiac - 75113"               | 48.826948 | 2.367038 |             0 |                |
| 1166825 |           | "Olympiades"                       | "91 rue de Tolbiac - 75113"               | 48.826948 | 2.367038 |             0 |                |
| 1166826 |           | "Bibliotheque-Francois Mitterrand" | "Face au 62 rue du Chevaleret - 75113"    | 48.829831 | 2.376120 |             0 |                |
| 1166827 |           | "Bibliotheque-Francois Mitterrand" | "Face au 62 rue du Chevaleret - 75113"    | 48.829831 | 2.376120 |             0 |                |
| 1166828 |           | "Cour Saint-Emilion"               | "Cour Chamonard - 75112"                  | 48.833314 | 2.387300 |             0 |                |
| 1166829 |           | "Cour Saint-Emilion"               | "Cour Chamonard - 75112"                  | 48.833314 | 2.387300 |             0 |                |
| 1166830 |           | "Bercy"                            | "Place du Bataillon du Pacifique - 75112" | 48.840543 | 2.379409 |             0 |                |
| 1166831 |           | "Bercy"                            | "Place du Bataillon du Pacifique - 75112" | 48.840543 | 2.379409 |             0 |                |
| 1166832 |           | "Gare de Lyon"                     | "Gare SNCF - 75112"                       | 48.844652 | 2.373108 |             0 |                |
| 1166833 |           | "Gare de Lyon"                     | "Gare SNCF - 75112"                       | 48.844652 | 2.373108 |             0 |                |
+---------+-----------+------------------------------------+-------------------------------------------+-----------+----------+---------------+----------------+

Stops 1166830 and 1166831 should belong to the same parent station, since they have the same latitude and longitude.

One idea comes to mind: given a radius r, two stops belong to the same station if the distance d between them is less than r, i.e., d < r.
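A minimal sketch of that radius idea in Python, to make it concrete. The assumptions here are mine and not part of GTFS: d is the haversine (great-circle) distance in metres, r = 50 is an arbitrary value, stops that are linked through a chain of close neighbours end up in the same group (union-find), and the function names haversine_m and group_by_radius are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres (approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def group_by_radius(stops, r):
    """Group stops whose pairwise distance d satisfies d < r.

    stops: list of (stop_id, stop_lat, stop_lon) tuples.
    Returns a sorted list of sorted stop_id lists, one list per group.
    Grouping is transitive: A near B and B near C puts A, B, C together.
    """
    parent = {stop_id: stop_id for stop_id, _, _ in stops}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Naive O(n^2) comparison of every pair; fine for a sketch,
    # a spatial index would be needed for a large feed.
    for i, (id_a, lat_a, lon_a) in enumerate(stops):
        for id_b, lat_b, lon_b in stops[i + 1:]:
            if haversine_m(lat_a, lon_a, lat_b, lon_b) < r:
                parent[find(id_b)] = find(id_a)

    groups = {}
    for stop_id, _, _ in stops:
        groups.setdefault(find(stop_id), []).append(stop_id)
    return sorted(sorted(g) for g in groups.values())


# Four rows taken from the stops table in this question.
sample = [
    (1166830, 48.840543, 2.379409),  # Bercy
    (1166831, 48.840543, 2.379409),  # Bercy
    (1166832, 48.844652, 2.373108),  # Gare de Lyon
    (1166833, 48.844652, 2.373108),  # Gare de Lyon
]
print(group_by_radius(sample, 50))
# [[1166830, 1166831], [1166832, 1166833]]
```

The result depends heavily on r: too small and the platforms of a large station are split, too large and neighbouring stations are merged.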

Any better ideas?

Upvotes: 0

Views: 730

Answers (1)

vcp

Reputation: 962

Assuming you are sure that these entries are not duplicates but distinct stops located inside the same station, I propose the following solution: find the stops that share the same name and location, then mark the first stop in each such list as a "station" (location_type = 1) and the remaining stops in the list as stops inside that station (parent_station set to the station's stop_id).

The GTFS reference document will help you do this. As an example, here are two edited rows (the changes are marked with ^^^^):

| 1166830 |  | "Bercy"| "Place du Bataillon du Pacifique - 75112" | 48.840543 | 2.379409 | 1 | |
                                                                                          ^^^
| 1166831 |  | "Bercy"| "Place du Bataillon du Pacifique - 75112" | 48.840543 | 2.379409 | 0 | 1166830 |
                                                                                               ^^^^^^^
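The same edit can be sketched in Python for the whole table. This is only an illustration under a few assumptions: rows are plain dicts keyed by the stops columns, "same location" means exactly equal stop_lat and stop_lon, the "first" stop of a group is the one with the smallest stop_id, and the function name assign_parent_stations is hypothetical.

```python
from collections import defaultdict


def assign_parent_stations(stops):
    """Return copies of the stop rows with location_type/parent_station set.

    Stops sharing the same (stop_name, stop_lat, stop_lon) form one group.
    In each group of two or more, the stop with the smallest stop_id becomes
    the station (location_type = 1) and the others point at it through
    parent_station. Stops that are alone in their group are left unchanged.
    """
    groups = defaultdict(list)
    for stop in stops:
        key = (stop["stop_name"], stop["stop_lat"], stop["stop_lon"])
        groups[key].append(stop)

    result = []
    for members in groups.values():
        members = sorted(members, key=lambda s: s["stop_id"])
        if len(members) < 2:
            result.extend(dict(s) for s in members)  # nothing to group
            continue
        station_id = members[0]["stop_id"]
        for index, stop in enumerate(members):
            row = dict(stop)  # copy, so the input rows are not modified
            if index == 0:
                row["location_type"] = 1
                row["parent_station"] = ""
            else:
                row["location_type"] = 0
                row["parent_station"] = station_id
            result.append(row)
    return result


rows = [
    {"stop_id": 1166830, "stop_name": "Bercy", "stop_lat": 48.840543,
     "stop_lon": 2.379409, "location_type": 0, "parent_station": ""},
    {"stop_id": 1166831, "stop_name": "Bercy", "stop_lat": 48.840543,
     "stop_lon": 2.379409, "location_type": 0, "parent_station": ""},
]
for row in assign_parent_stations(rows):
    print(row["stop_id"], row["location_type"], row["parent_station"])
# 1166830 1
# 1166831 0 1166830
```

One thing to check before applying this to a real feed: if stop_times references the stop that gets turned into a station, it may be safer to add a separate station row with a new stop_id and point every stop in the group at it.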

Upvotes: 2
