Reputation: 8021
I am still new to SQL and I am trying to improve the performance of my query. I have been searching around and have come to the conclusion that using JOINS instead of so many WHERE INS would help improve my performance, but I am unsure of how I would convert my statement. This is my current statement.
SELECT stop_id, stop_name FROM stops WHERE stop_id IN (
SELECT DISTINCT stop_id FROM stop_times WHERE trip_id IN (
SELECT trip_id from trips WHERE route_id = <routeid> ));
It takes anywhere from 5-25 seconds to return the results which is unacceptable. I was hoping to get it below 1 second. If anyone was wondering the data is from a GTFS feed. The stops and trips tables have about ~10,000 rows each, while the stop_times table has ~900,000. I have created indexes at each of the columns I am using. Here is the output of EXPLAIN, and also what was used to create each table.
Thanks for any help and if you need any more info let me know!
+----+--------------------+------------+-----------------+------------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+-----------------+------------------+---------+---------+------+------+-------------+
| 1 | PRIMARY | stops | ALL | NULL | NULL | NULL | NULL | 6481 | Using where |
| 2 | DEPENDENT SUBQUERY | stop_times | index_subquery | stop_id | stop_id | 63 | func | 63 | Using where |
| 3 | DEPENDENT SUBQUERY | trips | unique_subquery | PRIMARY,route_id | PRIMARY | 62 | func | 1 | Using where |
+----+--------------------+------------+-----------------+------------------+---------+---------+------+------+-------------+
| stops | CREATE TABLE `stops` (
`stop_id` varchar(20) NOT NULL,
`stop_code` varchar(50) DEFAULT NULL,
`stop_name` varchar(255) DEFAULT NULL,
`stop_desc` varchar(255) DEFAULT NULL,
`stop_lat` decimal(8,6) DEFAULT NULL,
`stop_lon` decimal(8,6) DEFAULT NULL,
`zone_id` int(11) DEFAULT NULL,
`stop_url` varchar(255) DEFAULT NULL,
`location_type` int(2) DEFAULT NULL,
`parent_station` int(11) DEFAULT NULL,
`wheelchair_boarding` int(2) DEFAULT NULL,
PRIMARY KEY (`stop_id`),
KEY `zone_id` (`zone_id`),
KEY `stop_lat` (`stop_lat`),
KEY `stop_lon` (`stop_lon`),
KEY `location_type` (`location_type`),
KEY `parent_station` (`parent_station`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
| stop_times | CREATE TABLE `stop_times` (
`trip_id` varchar(20) DEFAULT NULL,
`arrival_time` varchar(8) DEFAULT NULL,
`arrival_time_seconds` int(11) DEFAULT NULL,
`departure_time` varchar(8) DEFAULT NULL,
`departure_time_seconds` int(11) DEFAULT NULL,
`stop_id` varchar(20) DEFAULT NULL,
`stop_sequence` int(11) DEFAULT NULL,
`stop_headsign` varchar(50) DEFAULT NULL,
`pickup_type` int(2) DEFAULT NULL,
`drop_off_type` int(2) DEFAULT NULL,
`shape_dist_traveled` varchar(50) DEFAULT NULL,
KEY `trip_id` (`trip_id`),
KEY `arrival_time_seconds` (`arrival_time_seconds`),
KEY `departure_time_seconds` (`departure_time_seconds`),
KEY `stop_id` (`stop_id`),
KEY `stop_sequence` (`stop_sequence`),
KEY `pickup_type` (`pickup_type`),
KEY `drop_off_type` (`drop_off_type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
| trips | CREATE TABLE `trips` (
`route_id` varchar(20) DEFAULT NULL,
`service_id` varchar(20) DEFAULT NULL,
`trip_id` varchar(20) NOT NULL,
`trip_headsign` varchar(255) DEFAULT NULL,
`trip_short_name` varchar(255) DEFAULT NULL,
`direction_id` tinyint(1) DEFAULT NULL,
`block_id` int(11) DEFAULT NULL,
`shape_id` varchar(50) DEFAULT NULL,
PRIMARY KEY (`trip_id`),
KEY `route_id` (`route_id`),
KEY `service_id` (`service_id`),
KEY `direction_id` (`direction_id`),
KEY `block_id` (`block_id`),
KEY `shape_id` (`shape_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
Upvotes: 0
Views: 502
Reputation: 349
You're right in thinking that JOINS are usually faster than WHERE IN subqueries.
Try this:
SELECT T3.stop_id, T3.stop_name
FROM trips AS T1
JOIN
stop_times AS T2
ON T1.trip_id=T2.trip_id AND route_id = <routeid>
JOIN stops AS T3
ON T2.stop_id=T3.stop_id
GROUP BY T3.stop_id, T3.stop_name
Upvotes: 3