David Rodrigues
David Rodrigues

Reputation: 12532

Organizing `where` to make it faster

The important thing

Is possible identify the best order of WHERE conditionals to make it faster? For instance, I have a query with 6 conditionals. Some simple, other with sub-queries or functions. My idea is profile the query to identify how common the conditionals is true and how much it cost to run.

Read of Examples

For instance, I have it:

WHERE
    table.ageMin >= :ageFilter AND       #1 age-more-than
    table.ageMax <= :ageFilter AND       #2 age-less-than
    YEAR(table.date) >= :dateFilter AND  #3 year
    (SELECT ...) = TRUE AND              #4 first-query
    (SELECT ...) = FALSE AND             #5 second-query
    USER_FUNCTION(table.tag, :tagFilter) #6 user-function

So let say that, in a month, we have that:

  1. Is true in 95% of times;
  2. Is true in 69% of times;
  3. Is true in 15% of times;
  4. Is true in 97% of times;
  5. Is true in 50% of times;
  6. Is true in 99% of times;

And considering that (pseudo ms values):

  1. Is a simple conditional and cost 0.005ms for query;
  2. Is a simple conditional and cost 0.005ms for query;
  3. Use MySQL YEAR() and cost 0.030ms for query;
  4. Use sub-query and cost 0.140ms for query;
  5. Use sub-query and cost 0.260ms for query;
  6. Use user-function and cost 0.450ms for query;

So, I probably will think like that:

  1. #3 will fail in 85% of time and cost only 0.030ms for query;
  2. #2 will fail in 31% of time and cost only 0.005ms for query;
  3. #1 will fail in 5%, but cost only 0.005ms for query;
  4. #5 cost more than #4 by +0.120ms, but will fail in 50% of times;
  5. #4 cost 0.140ms, but fail only 3% of times;
  6. #6 cost 0.450ms, but fail only 1% of times;

So my WHERE order will be like:

WHERE
    YEAR(table.date) >= :dateFilter AND  #3 up, up
    table.ageMax <= :ageFilter AND       #2 maintained
    table.ageMin >= :ageFilter AND       #1 down down
    (SELECT ...) = FALSE AND             #5 up
    (SELECT ...) = TRUE AND              #4 down
    USER_FUNCTION(table.tag, :tagFilter) #6 maintained

So if I have 1.000.000 of queries in one month, on first example I'll have:

  1. 5% will fail in age-more-than, spent 5.000 seconds, rest 950.000;
  2. 31% will fail in age-less-than, spent 4.750 seconds, rest 655.500;
  3. 85% will fail in year, spent 19.665 seconds, rest 98.325;
  4. 3% will fail in first sub-query, spent 13.765 seconds, rest 95.375;
  5. 50% will fail in second sub-query, spent 24.797 seconds, rest 47.687;
  6. 1% will fail in user-function, spent 21.459 seconds, rest 47.210;

So in 1 million of queries, only 47.210 is true. And it'll spent 89.436 seconds to process all queries a long of month.

After I reorder my WHEREs, it'll be:

  1. 85% will fail in year, spent 30.000 seconds, rest 150.000;
  2. 31% will fail in age-less-than, spent 750 seconds, rest 103.500;
  3. 5% will fail in age-more-than, spent 517 seconds, rest 98.325;
  4. 50% will fail in second sub-query, spent 25.564 seconds, rest 49.165;
  5. 3% will fail in first sub-query, spent 6.883 seconds, rest 47.690;
  6. 1% will fail in user-function, spent 21.460 seconds, rest 47.210;

So in 1 million of queries, it'll spent 85.174, about 4.262 seconds less that original order. The difference is of 5% in this simple comparison, but I have tables with about 50 conditionals (some complexes, other not). And I tried to fix order and the query reduced from 0.500ms to 0.075ms.

But, it is a hard work to do, check one by one, and decide which is more common and less cost. So, there are some tool to make this work?

Upvotes: 2

Views: 38

Answers (4)

O. Jones
O. Jones

Reputation: 108651

MySQL's query planner reorders WHERE clauses joined together by AND to come up with what it guesses is an optimal execution plan.

You can't affect this by messing around with the order of clauses in your query.

You CAN affect it by using ANALYZE TABLE once your table is mostly populated.

You MAYBE can affect it by creating one or more compound indexes with columns that match the columns in your query. Read this. http://use-the-index-luke.com/

If you're benchmarking stuff, be sure to write SELECT SQL_NO_CACHE instead of SELECT at the beginning of your queries. Otherwise, MySQL will satisfy repeated queries from the cache and appear very fast.

Upvotes: 1

axiac
axiac

Reputation: 72226

The order of conditions in the WHERE clause does not matter. The MySQL query optimizer does a lot of types of changes to the query in order to make it run as fast as it can.

Your query is slow because of the sub-queries. The optimizer can convert some types of sub-queries into table joins and this change brings a big improvement on the running speed.

I cannot suggest any improvements to your sub-queries because you didn't provide them.

You can improve the evaluation of condition YEAR(table.date) >= :dateFilter easily. The usage of a function (YEAR()) prevents MySQL to use an index. If you change it to compare table.date against the first day of the year (put 2015-01-01 instead of 2015 in dateFilter, for example) then MySQL can use an index and speed up the execution.

Of course, MySQL uses an index if you have created one for that column.Be sure you have indexes on columns date, ageMin and ageMax.

Put EXPLAIN in front of your query and check the result against the documentation of EXPLAIN output format to find out why your query is slow and how you can improve it.

Upvotes: 2

54l3d
54l3d

Reputation: 3973

What if we order (may be forced by hints) the where conditions so that we eliminate the maximum of rows from the first condition, if we start the where by the #6 condition that eliminate 99% of rows, the rest of conditions will work only with the 1% left, and so on. So i suggest to order conditions from the condition that eliminate the maximum of rows ending by the least one.

Upvotes: 0

Norbert
Norbert

Reputation: 6084

No, the optimizer in mysql does not care about the order of the where conditions. It uses statistics (histogram) to determine in what order to apply the where statement (AND use the correct indexes: Have your indexes in place). If it is really off target you can use a hint and force the use of a certain index, but then your query is optimized for just this one case, which can be really detrimental in performance for other cases

Upvotes: 1

Related Questions