Reputation: 12384
I'm using the data from Rennes (2nd zip file from that page), and this tables schema.
Here's my first query, it lists the stops from a route (from the first trip of that route):
select
first_trip_of_route.trip_id,
st.stop_id,
s.stop_name
from (
select
t.trip_id,
r.route_long_name
from routes r
left join trips t on t.route_id = r.route_id
where r.route_id = '0033'
limit 1
) as first_trip_of_route
left join stop_times st on st.trip_id = first_trip_of_route.trip_id
left join stops s on s.stop_id = st.stop_id
order by st.stop_sequence;
It works, here are the stops of the bus line 33 in Rennes:
+---------+---------+----------------------+
| trip_id | stop_id | stop_name |
+---------+---------+----------------------+
| 2420 | 2220 | Gautrais |
| 2420 | 2221 | Rossel |
| 2420 | 2234 | Pommerais |
| 2420 | 2223 | Abbé Grimault |
| 2420 | 2232 | Morinais |
| 2420 | 2202 | Collège Jean Moulin |
| 2420 | 2214 | Médiathèque |
| 2420 | 2204 | Jean Marin |
| 2420 | 2263 | Jean Jaurès |
| 2420 | 2205 | Blosne |
| 2420 | 2225 | Gaité |
| 2420 | 2230 | Rablais Allende |
| 2420 | 2227 | Croix Verte |
| 2420 | 2271 | 25 Fusillés |
| 2420 | 1454 | Bréquigny Piscine |
| 2420 | 1455 | Lycée Bréquigny |
| 2420 | 1456 | Coubertin |
| 2420 | 1457 | Norvège |
| 2420 | 1130 | Canada |
| 2420 | 1623 | Alma |
| 2420 | 1459 | Henri Fréville |
| 2420 | 1460 | Argonautes |
| 2420 | 1461 | Clemenceau |
| 2420 | 1462 | Combes |
| 2420 | 1464 | Binquenais |
| 2420 | 1463 | Binquenais Collège |
| 2420 | 1465 | Triangle |
| 2420 | 1353 | Torigné |
| 2420 | 1466 | Hôpital Sud |
| 2420 | 1467 | Le Blosne |
| 2420 | 1356 | Galicie |
| 2420 | 1468 | La Poterie |
| 2420 | 3020 | Val Blanc |
| 2420 | 3021 | Rocade Sud |
| 2420 | 3008 | Loges |
| 2420 | 3009 | Chantepie Mairie |
| 2420 | 3010 | Chantepie Eglise |
| 2420 | 3022 | Hallouvry |
| 2420 | 3017 | IDEFS |
| 2420 | 3016 | Cucé |
+---------+---------+----------------------+
Now I want to add for each stop, the available routes from that stop.
First I connected the stop_id
s, but unfortunately Rennes decided that if the stop is not the very same building, it's not the same stop, even if it's 10 meters across. Actually that makes sense, but it's not making our life easier here :)
So I tried to connect on the stop name. Here's an example for Alma
:
mysql> select stop_id, stop_name from stops where stop_name = 'Alma';
+---------+-----------+
| stop_id | stop_name |
+---------+-----------+
| 1622 | Alma |
| 1623 | Alma |
+---------+-----------+
2 rows in set (0.04 sec)
Cool. How about finding the routes available at that stop?
mysql> select r2.route_id as route_id,
s2.stop_name as stop_name
from stops s2
left join stop_times st2 on st2.stop_id = s2.stop_id
left join trips t2 on t2.trip_id = st2.trip_id
left join routes r2 on r2.route_id = t2.route_id
where s2.stop_name = 'Alma'
group by r2.route_id;
+----------+-----------+
| route_id | stop_name |
+----------+-----------+
| 0003 | Alma |
| 0033 | Alma |
+----------+-----------+
2 rows in set (0.13 sec)
Great. When we're in Alma
, we can get on the bus 3 or 33.
Now let's mix the two queries together:
select
first_trip_of_route.trip_id,
st.stop_id,
s.stop_name,
connections.route_id
from (
select
t.trip_id,
r.route_long_name
from routes r
left join trips t on t.route_id = r.route_id
where r.route_id = '0033'
limit 1
) as first_trip_of_route
left join stop_times st on st.trip_id = first_trip_of_route.trip_id
left join stops s on s.stop_id = st.stop_id
left join (
select
r2.route_id as route_id, s2.stop_name as stop_name
from stops s2
left join stop_times st2 on st2.stop_id = s2.stop_id
left join trips t2 on t2.trip_id = st2.trip_id
left join routes r2 on r2.route_id = t2.route_id
group by r2.route_id
) connections
on connections.stop_name = s.stop_name
order by st.stop_sequence
It works for most stops, but as you can see it says there's no connection in Alma:
+---------+---------+----------------------+----------+
| trip_id | stop_id | stop_name | route_id |
+---------+---------+----------------------+----------+
| 2420 | 2220 | Gautrais | NULL |
| 2420 | 2221 | Rossel | NULL |
| 2420 | 2234 | Pommerais | NULL |
| 2420 | 2223 | Abbé Grimault | NULL |
| 2420 | 2232 | Morinais | NULL |
| 2420 | 2202 | Collège Jean Moulin | NULL |
| 2420 | 2214 | Médiathèque | NULL |
| 2420 | 2204 | Jean Marin | NULL |
| 2420 | 2263 | Jean Jaurès | NULL |
| 2420 | 2205 | Blosne | NULL |
| 2420 | 2225 | Gaité | NULL |
| 2420 | 2230 | Rablais Allende | NULL |
| 2420 | 2227 | Croix Verte | NULL |
| 2420 | 2271 | 25 Fusillés | NULL |
| 2420 | 1454 | Bréquigny Piscine | NULL |
| 2420 | 1455 | Lycée Bréquigny | NULL |
| 2420 | 1456 | Coubertin | 0213 |
| 2420 | 1456 | Coubertin | 0212 |
| 2420 | 1457 | Norvège | NULL |
| 2420 | 1130 | Canada | 0033 |
| 2420 | 1623 | Alma | NULL | <<< WTF?
| 2420 | 1459 | Henri Fréville | 0037 |
| 2420 | 1459 | Henri Fréville | 0159 |
| 2420 | 1459 | Henri Fréville | 0074 |
| 2420 | 1459 | Henri Fréville | 0172 |
| 2420 | 1459 | Henri Fréville | 0079 |
| 2420 | 1460 | Argonautes | NULL |
| 2420 | 1461 | Clemenceau | NULL |
| 2420 | 1462 | Combes | NULL |
| 2420 | 1464 | Binquenais | NULL |
| 2420 | 1463 | Binquenais Collège | NULL |
| 2420 | 1465 | Triangle | 0061 |
| 2420 | 1465 | Triangle | 0161 |
| 2420 | 1353 | Torigné | NULL |
| 2420 | 1466 | Hôpital Sud | NULL |
| 2420 | 1467 | Le Blosne | NULL |
| 2420 | 1356 | Galicie | NULL |
| 2420 | 1468 | La Poterie | 0214 |
| 2420 | 1468 | La Poterie | 0075 |
| 2420 | 1468 | La Poterie | 0173 |
| 2420 | 1468 | La Poterie | 0073 |
| 2420 | 3020 | Val Blanc | NULL |
| 2420 | 3021 | Rocade Sud | NULL |
| 2420 | 3008 | Loges | NULL |
| 2420 | 3009 | Chantepie Mairie | NULL |
| 2420 | 3010 | Chantepie Eglise | NULL |
| 2420 | 3022 | Hallouvry | NULL |
| 2420 | 3017 | IDEFS | NULL |
| 2420 | 3016 | Cucé | NULL |
+---------+---------+----------------------+----------+
What gives?
Upvotes: 0
Views: 924
Reputation: 12384
OK that wasn't a great question. I ended up not linking the stops by name, but by distance. Here's my final query, if someone is interested...
select
first_trip_of_route.trip_id, first_trip_of_route.route_long_name,
st.stop_id,
st.departure_time,
s.stop_name,
connected_stops.stop_id as connected_stop_id, connected_stops.stop_name as connected_stop_name,
(6371000*acos(cos(radians(s.stop_lat))*cos(radians(connected_stops.stop_lat))*cos(radians(s.stop_lon)-radians(connected_stops.stop_lon))+sin(radians(s.stop_lat))*sin(radians(connected_stops.stop_lat)))) as connected_stop_distance,
connected_routes.route_id as connected_route_id, connected_routes.route_long_name as connected_route_name
-- look for the 1st trip of that route
from (
select
t.trip_id,
r.route_long_name
from routes r
left join trips t on t.route_id = r.route_id
where r.route_id = '0033'
limit 1
) as first_trip_of_route
-- get the list of stops taken by that trip
left join stop_times st on st.trip_id = first_trip_of_route.trip_id
-- add info about the stops (stop name)
left join stops s on s.stop_id = st.stop_id
-- look for stops next to this one (200 meters)
left join (
select stop_id, stop_lat, stop_lon, stop_name
FROM stops connected_stops
) as connected_stops on
(6371000*acos(cos(radians(s.stop_lat))*cos(radians(connected_stops.stop_lat))*cos(radians(s.stop_lon)-radians(connected_stops.stop_lon))+sin(radians(s.stop_lat))*sin(radians(connected_stops.stop_lat))))
< 200
and connected_stops.stop_id <> s.stop_id
-- add all the vehicles that make those stops
left join stop_times connected_stop_times on connected_stop_times.stop_id = connected_stops.stop_id
-- get the trips of those vehicles
left join trips connected_trips on connected_trips.trip_id = connected_stop_times.trip_id
-- get the routes from which these trips belong
left join routes connected_routes on connected_routes.route_id = connected_trips.route_id
-- ensure we get only one connected route per stop 200 meters from the initial searched stop
group by s.stop_id, connected_stop_id, connected_route_id
order by st.stop_sequence;
Here's a limited output:
+---------+----------------------+-------------------+----------------------+-------------------------+--------------------+
| stop_id | stop_name | connected_stop_id | connected_stop_name | connected_stop_distance | connected_route_id |
+---------+----------------------+-------------------+----------------------+-------------------------+--------------------+
... many rows here ...
| 1130 | Canada | 1449 | Norvège | 185.0837104437 | 0804 |
| 1130 | Canada | 1449 | Norvège | 185.0837104437 | 0059 |
| 1130 | Canada | 1449 | Norvège | 185.0837104437 | 0033 |
| 1623 | Alma | 1622 | Alma | 65.261785846695 | 0003 |
| 1623 | Alma | 1622 | Alma | 65.261785846695 | 0033 |
| 1459 | Henri Fréville | 1447 | Henri Fréville | 97.6477727244322 | 0037 |
| 1459 | Henri Fréville | 1447 | Henri Fréville | 97.6477727244322 | 0033 |
... many rows here....
+---------+----------------------+-------------------+----------------------+-------------------------+--------------------+
166 rows in set (0.53 sec)
We can see that stop 1623
'Alma' is indeed connected to stop 1622
'Alma' (65 meters across), with the available routes 0003
and 0033
.
Upvotes: -1