Reputation: 560
I'm looking to optimise the following MySQL query:
SELECT
`l`.`id` AS `l__id`, `l`.`latitude` AS `l__latitude`, `l`.`longitude` AS `l__longitude`, `l`.`name` AS `l__name`, `f`.`id` AS `f__id`, `f`.`day` AS `f__day`, `f`.`temperature_low` AS `f__temperature_low`, `f`.`temperature_high` AS `f__temperature_high`, `c`.`id` AS `c__id`, `c`.`name` AS `c__name`
FROM `location` `l`
INNER JOIN `forecast` `f` ON `l`.`id` = `f`.`location_id`
INNER JOIN `condition` `c` ON `f`.`condition_id` = `c`.`id`
WHERE (
`f`.`day` IN ('2012-12-28', '2012-12-29') AND
EXISTS (
SELECT
`f2`.`id` AS `f2__id`
FROM `forecast` `f2`
WHERE (
`f2`.`location_id` = `l`.`id` AND `f2`.`day` = "2012-12-28" AND `f2`.`condition_id` IN (6, 7, 9, 10, 11, 12, 13, 14, 18) AND `f2`.`temperature_high` <= 30 AND `f2`.`temperature_low` >= 0
)
) AND
EXISTS (
SELECT
`f3`.`id` AS `f3__id`
FROM `forecast` `f3`
WHERE (
`f3`.`location_id` = `l`.`id` AND `f3`.`day` = "2012-12-29" AND `f3`.`condition_id` IN (6, 7, 9, 10, 11, 12, 13, 14, 18) AND `f3`.`temperature_high` <= 30 AND `f3`.`temperature_low` >= 0
)
)
AND `l`.`latitude` IS NOT NULL AND `l`.`longitude` IS NOT NULL
);
Running an EXPLAIN gives the following output:
+----+--------------------+-------+--------+--------------------------------------+-----------------+---------+------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+--------+--------------------------------------+-----------------+---------+------------------------+------+----------------------------------------------+
| 1 | PRIMARY | f | range | location_id_idx,condition_id_idx,day | day | 3 | NULL | 1298 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | l | eq_ref | PRIMARY | PRIMARY | 8 | weather.f.location_id | 1 | Using where |
| 1 | PRIMARY | c | eq_ref | PRIMARY | PRIMARY | 8 | weather.f.condition_id | 1 | |
| 3 | DEPENDENT SUBQUERY | f3 | ref | location_id_idx,condition_id_idx,day | location_id_idx | 9 | weather.l.id | 276 | Using where |
| 2 | DEPENDENT SUBQUERY | f2 | ref | location_id_idx,condition_id_idx,day | location_id_idx | 9 | weather.l.id | 276 | Using where |
+----+--------------------+-------+--------+--------------------------------------+-----------------+---------+------------------------+------+----------------------------------------------+
I understand that the problem is "Using where; Using temporary; Using filesort" and the sub queries but i'm unsure how to re-write this for performance.
Thanks in advance for any pointers which can help me with problems like this in the future.
Pete
Upvotes: 0
Views: 117
Reputation:
In your query you are supplying predicates to restrict the forecast
set but are joining your tables out of order. The mysql optimiser does some work and uses the day
key in forecast and restricts the forecast set using f.day IN ('2012-12-28', '2012-12-29')
before joining the other tables. The optimiser is not able to use your dependent subqueries (which look unneccessary) to further restrict the forecast set by condition_id
, temperature_high
and temperature_low
before joining so this is done after all joins have been made. In order to improve the query you should remove your dependent subqueries and restrict the forecast set to begin with.
SELECT *
FROM forecast f
INNER JOIN location l
ON l.id = f.location_id AND l.latitude IS NOT NULL AND l.longitude IS NOT NULL
INNER JOIN condition c
ON f.condition_id = c.id
WHERE f.day IN ('2012-12-28', '2012-12-29')
AND f.condition_id IN (6, 7, 9, 10, 11, 12, 13, 14, 18)
AND f.temperature_high <= 30 AND f.temperature_low >= 0
Upvotes: 1