Anton Soradoi
Anton Soradoi

Reputation: 1851

Why does a MySQL query take anywhere from 1 millisecond to 7 seconds?

I have an SQL query(see below) that returns exactly what I need but when ran through phpMyAdmin takes anywhere from 0.0009 seconds to 0.1149 seconds and occasionally all the way up to 7.4983 seconds.

Query:

SELECT
  e.id,
  e.title,
  e.special_flag,
  CASE WHEN a.date >= '2013-03-29' THEN a.date ELSE '9999-99-99' END as date
  CASE WHEN a.date >= '2013-03-29' THEN a.time ELSE '99-99-99' END as time,
  cat.lastname,
  FROM e_table as e
  LEFT JOIN a_table as a ON (a.e_id=e.id)
  LEFT JOIN c_table as c ON (e.c_id=c.id)
  LEFT JOIN cat_table as cat ON (cat.id=e.cat_id)
  LEFT JOIN m_table as m ON (cat.name=m.name AND cat.lastname=m.lastname)
  JOIN (
          SELECT DISTINCT innere.id
          FROM e_table as innere
          LEFT JOIN a_table as innera ON (innera.e_id=innere.id AND
                                          innera.date >= '2013-03-29')
          LEFT JOIN c_table as innerc ON (innere.c_id=innerc.id)
          WHERE (
                  (
                    innera.date >= '2013-03-29' AND 
                    innera.flag_two=1
                  ) OR 
                  innere.special_flag=1
                ) AND
                innere.flag_three=1 AND 
                innere.flag_four=1
          ORDER BY COALESCE(innera.date, '9999-99-99') ASC,
                   innera.time ASC,
                   innere.id DESC LIMIT 0, 10
       ) AS elist ON (e.id=elist.id)
  WHERE (a.flag_two=1 OR e.special_flag) AND e.flag_three=1 AND e.flag_four=1
  ORDER BY a.date ASC, a.time ASC, e.id DESC

Explain Plan: The above query explain plan

The question is: Which part of this query could be causing the wide range of difference in performance?

Upvotes: 3

Views: 3016

Answers (2)

DRapp
DRapp

Reputation: 48139

Lets start that both your outer and inner query are working with the "e" table WITH a minimum requirement of flag_three = 1 AND flag_four = 1 (regardless of your inner query's (( x and y ) or z) condition. Also, your outer WHERE clause has explicit reference to the a.Flag_two, but no NULL which forces your LEFT JOIN to actually become an (INNER) JOIN. Also, it appears every "e" record MUST have a category as you are looking for the "cat.lastname" and no coalesce() if none found. This makes sense at it appears to be a "lookup" table reference. As for the "m_table" and "c_table", you are not getting or doing anything with it, so they can be removed completely.

Would the following query get you the same results?

select 
      e1.id,
      e1.Title,
      e1.Special_Flag,
      e1.cat_id,
      coalesce( a1.date, '9999-99-99' ) ADate,
      coalesce( a1.time, '99-99-99' ) ATime
      cat.LastName
   from
      e_table e1
         LEFT JOIN a_table as a1
             ON e1.id = a1.e_id
            AND a1.flag_two = 1
            AND a1.date >= '2013-03-29'

         JOIN cat_table as cat 
             ON e1.cat_id = cat.id
   where
          e1.flag_three = 1
      and e1.flag_four = 1 
      and (   e1.special_flag = 1
           OR a1.id IS NOT NULL )
   order by
      IF( a1.id is null, 2, 1 ),
      ADate,
      ATime,
      e1.ID Desc
   limit
      0, 10

The Main WHERE clause qualifies for ONLY those that have the "three and four" flags set to 1 PLUS EITHER the ( special flag exists OR there is a valid "a" record that is on/after the given date in question).

From that, simple order by and limit.

As for getting the date and time, it appears that you only want records on/after the date to be included, otherwise ignore them (such as they are old and not applicable, you don't want to see them).

The order by, I am testing FIRST for a NULL value for the "a" ID. If so, we know they will all be forced to a date of '9999-99-99' and time of '99-99-99' and want them pushed to the bottom (hence 2), otherwise, there IS an "a" record and you want those first (hence 1). Then, sort by the date/time respectively and then the ID descending in case many within the same date/time.

Finally, to help on the indexes, I would ensure your "e" table has an index on

( id, flag_three, flag_four, special_flag ).

For the "a" table, index on

(e_id, flag_two, date)

Upvotes: 1

Niels Keurentjes
Niels Keurentjes

Reputation: 41958

To specifically answer your question: it's not a specific part of the query that's causing the wide range of performance. That's MySQL doing what it's supposed to do - being a Relational Database Management System (RDBMS), not just a dumb SQL wrapper around comma separated files.

When you execute a query, the following things happen:

  1. The query is compiled to a 'parametrized' query, eliminating all variables down to the pure structural SQL.
  2. The compilation cache is checked to find whether a recent usable execution plan is found for the query.
  3. The query is compiled into an execution plan if needed (this is what the 'EXPLAIN' shows)
  4. For each execution plan element, the memory caches are checked whether they contain fresh and usable data, otherwise the intermediate data is assembled from master table data.
  5. The final result is assembled by putting all the intermediate data together.

What you are seeing is that when the query costs 0.0009 seconds, the cache was fresh enough to supply all data together, and when it peaks at 7.5 seconds either something was changed in the queried tables, or other queries 'pushed' the in-memory cache data out, or the DBMS has other reasons to suspect it needs to recompile the query or fetch all data again. Probably some of the other variations have to do with used indexes still being cached freshly enough in memory or not.

Concluding this, the query is ridiculously slow, you're just sometimes lucky that caching makes it appear fast.

To solve this, I'd recommend looking into 2 things:

  1. First and foremost - a query this size should not have a single line in its execution plan reading "No possible keys". Research how indexes work, make sure you realize the impact of MySQL's limitation of using a single index per joined table, and tweak your database so that each line of the plan has an entry under 'key'.
  2. Secondly, review the query in itself. DBMS's are at their fastest when all they have to do is combine raw data. Using programmatic elements like CASE and COALESCE are by all means often useful, but they do force the database to evaluate more things at runtime than just take raw table data. Try to eliminate such statements, or move them to the business logic as post-processing with the retrieved data.

Finally, never forget that MySQL is actually a rather stupid DBMS. It is optimized for performance in simple data fetching queries such as most websites require. As such it is much faster than SQL Server and Oracle for most generic problems. Once you start complicating things with functions, cases, huge join or matching conditions etc., the competitors are frequently much better optimized, and have better optimization in their query compilers. As such, when MySQL starts becoming slow in a specific query, consider splitting it up in 2 or more smaller queries just so it doesn't become confused, and do some postprocessing in PHP or whatever language you are calling with. I've seen many cases where this increased performance a LOT, just by not confusing MySQL, especially in cases where subqueries were involved (as in your case). Especially the fact that your subquery is a derived table, and not just a subquery, is known to complicate stuff for MySQL beyond what it can cope with.

Upvotes: 6

Related Questions