Sean
Sean

Reputation: 408

MySQL Query Explanation: Index choosing when the query has "WHERE" as well as "ORDER BY"

There is something I cannot understand when I try to optimize my query using index.

The table looks like this:

create table order_table (
  id int not null primary key,
  customer_id varchar(50) not null,
  sale int not null,
  key customer_sale_idx (customer_id, sale)
);

when I run explain select * from order_table where customer_id=2 order by sale limit 1;

It shows

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: order_table
   partitions: NULL
         type: index
possible_keys: customer_sale_idx
          key: customer_sale_idx
      key_len: 206
          ref: NULL
         rows: 5
     filtered: 20.00
        Extra: Using where; Using index; Using filesort

And then I add another index alter table order_table add key sale_customer_idx (sale, customer_id);

now the explain select * from order_table where customer_id=2 order by sale limit 1; query shows

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: order_table
   partitions: NULL
         type: index
possible_keys: customer_sale_idx
          key: sale_customer_idx
      key_len: 206
          ref: NULL
         rows: 1
     filtered: 20.00
        Extra: Using where; Using index

Q1: I suppose mysql should check the customer_id first, get those records that match the where, then do the sort by sale. It uses the customer_sale_idx index as I expected, but I don't konw why it use Using filesort rather than utilize the cover index (customer_id, sale) to do the sort.

Q2: MySQL uses the new index sale_customer_idx (sale, customer_id) after I added it. But I don't understand how this index helps to get rid of the Using filesort.

Q3: The latter explain says there is only one possible_keys which is customer_sale_idx, but then it use sale_customer_idx. How?

BTW,I know it's strange that customer_id is varchar. But let's just focus on explaining this phenomenon.

Upvotes: 4

Views: 45

Answers (2)

Solarflare
Solarflare

Reputation: 11106

MySQL has to cast the first column of your index (customer_id, sale) to an integer. So it cannot use the second column to sort.

... where customer_id=2 ...

will be true for customer_ids with the varchar values '2', '02', '002', '2 and a half men' and many more. It cannot jump to '2', read the first row and assume that it is one with the lowest sale-value, as there can also be '02' with a lower sale-value. Only this would allow MySQL to not sort.

Instead it has to go through the whole table, cast every row, check if it = 2 after casting, take the ones that are, and sort whatever is left. That's why you have the filesort there. MySQL doesn't know you have nothing else there that would cast to 2.

In fact, it could do the same with the primary key. Since your index is covering, they are equivalent.

You can verify this if you do the comparison with a varchar, which doesn't require casting, thus the index entries it finds are correctly ordered by the second column sale:

... where customer_id='2' ...

Now it can do exactly as it was expected: jump to '2', take the first row, stop. No sort required. MySQL knows: this is the row with the lowest value for sale.

Your index (sale, customer_id) works the following way: it reads all values, ordered by sale, until it reaches one that is (autocasted) equal to 2. MySQL can be sure now that this is the one you wanted (as you checked all lower values of sale already, none of them fit your condition on customer_id. Thus, no ordering is required.

Upvotes: 1

Gordan Bobić
Gordan Bobić

Reputation: 1878

It looks like there are very few rows in that table. With very small tables, table scans and sorting operations can be cheaper than index scans. Once you have a few thousand rows in it, the execution plan will be more predictable.

Upvotes: 1

Related Questions