Peter Hough
Peter Hough

Reputation: 560

MySQL Query Optimisation - Sub Queries & Using where; Using temporary; Using filesort

I'm looking to optimise the following MySQL query:

SELECT 
    `l`.`id` AS `l__id`, `l`.`latitude` AS `l__latitude`, `l`.`longitude` AS `l__longitude`, `l`.`name` AS `l__name`, `f`.`id` AS `f__id`, `f`.`day` AS `f__day`, `f`.`temperature_low` AS `f__temperature_low`, `f`.`temperature_high` AS `f__temperature_high`, `c`.`id` AS `c__id`, `c`.`name` AS `c__name` 
FROM `location` `l` 
INNER JOIN `forecast` `f` ON `l`.`id` = `f`.`location_id` 
INNER JOIN `condition` `c` ON `f`.`condition_id` = `c`.`id` 
WHERE (
    `f`.`day` IN ('2012-12-28', '2012-12-29') AND 
    EXISTS (
        SELECT 
            `f2`.`id` AS `f2__id` 
        FROM `forecast` `f2` 
        WHERE (
            `f2`.`location_id` = `l`.`id` AND `f2`.`day` = "2012-12-28" AND `f2`.`condition_id` IN (6, 7, 9, 10, 11, 12, 13, 14, 18) AND `f2`.`temperature_high` <= 30 AND `f2`.`temperature_low` >= 0
        )
    ) AND 
    EXISTS (
        SELECT 
            `f3`.`id` AS `f3__id` 
        FROM `forecast` `f3` 
        WHERE (
            `f3`.`location_id` = `l`.`id` AND `f3`.`day` = "2012-12-29" AND `f3`.`condition_id` IN (6, 7, 9, 10, 11, 12, 13, 14, 18) AND `f3`.`temperature_high` <= 30 AND `f3`.`temperature_low` >= 0
        )
    ) 
    AND `l`.`latitude` IS NOT NULL AND `l`.`longitude` IS NOT NULL
);

Running an EXPLAIN gives the following output:

+----+--------------------+-------+--------+--------------------------------------+-----------------+---------+------------------------+------+----------------------------------------------+
| id |    select_type     | table |  type  |            possible_keys             |       key       | key_len |          ref           | rows |                    Extra                     |
+----+--------------------+-------+--------+--------------------------------------+-----------------+---------+------------------------+------+----------------------------------------------+
|  1 | PRIMARY            | f     | range  | location_id_idx,condition_id_idx,day | day             |       3 | NULL                   | 1298 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY            | l     | eq_ref | PRIMARY                              | PRIMARY         |       8 | weather.f.location_id  |    1 | Using where                                  |
|  1 | PRIMARY            | c     | eq_ref | PRIMARY                              | PRIMARY         |       8 | weather.f.condition_id |    1 |                                              |
|  3 | DEPENDENT SUBQUERY | f3    | ref    | location_id_idx,condition_id_idx,day | location_id_idx |       9 | weather.l.id           |  276 | Using where                                  |
|  2 | DEPENDENT SUBQUERY | f2    | ref    | location_id_idx,condition_id_idx,day | location_id_idx |       9 | weather.l.id           |  276 | Using where                                  |
+----+--------------------+-------+--------+--------------------------------------+-----------------+---------+------------------------+------+----------------------------------------------+

I understand that the problem is "Using where; Using temporary; Using filesort" and the sub queries but i'm unsure how to re-write this for performance.

Thanks in advance for any pointers which can help me with problems like this in the future.

Pete

Upvotes: 0

Views: 117

Answers (1)

user1914530
user1914530

Reputation:

In your query you are supplying predicates to restrict the forecast set but are joining your tables out of order. The mysql optimiser does some work and uses the day key in forecast and restricts the forecast set using f.day IN ('2012-12-28', '2012-12-29') before joining the other tables. The optimiser is not able to use your dependent subqueries (which look unneccessary) to further restrict the forecast set by condition_id, temperature_high and temperature_low before joining so this is done after all joins have been made. In order to improve the query you should remove your dependent subqueries and restrict the forecast set to begin with.

SELECT *
FROM forecast f 
INNER JOIN location l 
  ON l.id = f.location_id AND l.latitude IS NOT NULL AND l.longitude IS NOT NULL
INNER JOIN condition c
  ON f.condition_id = c.id
WHERE f.day IN ('2012-12-28', '2012-12-29') 
  AND f.condition_id IN (6, 7, 9, 10, 11, 12, 13, 14, 18) 
  AND f.temperature_high <= 30 AND f.temperature_low >= 0

Upvotes: 1

Related Questions