Reputation: 17
I am using a GTFS relational database to find the fastest trip time offered between two given points. The fields I have are trip_id
, arrival_time
and departure_time
(in minutes from midnight), and stop_id
A tranche of the data might be:
trip_id | arrival_time | departure_time | stop_id
1 | 5 | 5 | 90001
1 | 8 | 8 | 90002
1 | 10 | 10 | 90003
1 | 15 | 15 | 90004
2 | 25 | 25 | 90001
2 | 28 | 28 | 90002
2 | 32 | 33 | 90003
2 | 38 | 38 | 90004
3 | 35 | 35 | 90001
3 | 38 | 38 | 90002
3 | 48 | 48 | 90004
4 | 8 | 8 | 90003
4 | 10 | 10 | 90004
4 | 15 | 15 | 90005
I am looking for the shortest time (and associated trip) between the departure from stop 90001 and the arrival at stop 90003. The answer is 5 minutes for trip '1' which beats the 8 minutes for trip 2 and the fact that trip 3 doesn't stop at 90003. Also, the time differences have to be on the same trip the output can't be 3 minutes from a combination of trip 1 and 4.
How do I write an SQL statement to do this? So far I have failed miserably:
SELECT trip_id, arrival_time, departure_time, stop_id, MIN(departure_time-arrival_time)
FROM stop_times
WHERE stop_id IN (90001, 90003)
GROUP BY trip_id
(I have already subset the data so all trips are in the direction I want so I will not have to flip the arrival and departure on the fly. It will always be the time between the departure_time
of the first stop_id
and the arrival_time
of the second.)
Upvotes: 0
Views: 216
Reputation: 62861
I think this should work using max
and case
with limit
:
select trip_id,
max(case when stop_id = 90003 then arrival_time end)
- max(case when stop_id = 90001 then departure_time end)
from stop_times
where stop_id in (90001,90003)
group by trip_id
order by 1
limit 1
I will note that for trip_id 2
, this returns 7 instead of 8 (32-25). But perhaps it should use the departure_time
for both case
statements.
Upvotes: 1