gren
gren

Reputation: 17

Shortest time and trip between two stops across many trip options (GTFS)

I am using a GTFS relational database to find the fastest trip time offered between two given points. The fields I have are trip_id, arrival_time and departure_time (in minutes from midnight), and stop_id

A tranche of the data might be:

trip_id | arrival_time | departure_time | stop_id
1       | 5            | 5              | 90001
1       | 8            | 8              | 90002
1       | 10           | 10             | 90003
1       | 15           | 15             | 90004
2       | 25           | 25             | 90001
2       | 28           | 28             | 90002
2       | 32           | 33             | 90003
2       | 38           | 38             | 90004
3       | 35           | 35             | 90001
3       | 38           | 38             | 90002
3       | 48           | 48             | 90004
4       | 8            | 8              | 90003
4       | 10           | 10             | 90004
4       | 15           | 15             | 90005

I am looking for the shortest time (and associated trip) between the departure from stop 90001 and the arrival at stop 90003. The answer is 5 minutes for trip '1' which beats the 8 minutes for trip 2 and the fact that trip 3 doesn't stop at 90003. Also, the time differences have to be on the same trip the output can't be 3 minutes from a combination of trip 1 and 4.

How do I write an SQL statement to do this? So far I have failed miserably:

SELECT trip_id, arrival_time, departure_time, stop_id, MIN(departure_time-arrival_time)
FROM stop_times
WHERE stop_id IN (90001, 90003)
GROUP BY trip_id

(I have already subset the data so all trips are in the direction I want so I will not have to flip the arrival and departure on the fly. It will always be the time between the departure_time of the first stop_id and the arrival_time of the second.)

Upvotes: 0

Views: 216

Answers (1)

sgeddes
sgeddes

Reputation: 62861

I think this should work using max and case with limit:

select trip_id, 
  max(case when stop_id = 90003 then arrival_time end)
  - max(case when stop_id = 90001 then departure_time end)
from stop_times
where stop_id in (90001,90003)
group by trip_id
order by 1
limit 1

I will note that for trip_id 2, this returns 7 instead of 8 (32-25). But perhaps it should use the departure_time for both case statements.

Upvotes: 1

Related Questions