John Reese
John Reese

Reputation: 573

Why does MySQL issue a filesort?

This is my table definition:

CREATE TABLE difficulty (
  uuid binary(16) NOT NULL,
  createdTimestamp datetime(6) DEFAULT NULL,
  modifiedTimestamp datetime(6) DEFAULT NULL,
  name varchar(255) DEFAULT NULL,
  PRIMARY KEY (uuid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE exercise_session (
  uuid binary(16) NOT NULL,
  createdTimestamp datetime(6) DEFAULT NULL,
  modifiedTimestamp datetime(6) DEFAULT NULL,
  type varchar(16) DEFAULT NULL,
  status int(11) DEFAULT NULL,
  difficulty_uuid binary(16) DEFAULT NULL,

  PRIMARY KEY (uuid),
  KEY (difficulty_uuid),
  KEY (difficulty_uuid,modifiedTimestamp),
  KEY (modifiedTimestamp),
  KEY (status),
  FOREIGN KEY (difficulty_uuid) REFERENCES difficulty (uuid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

This is my query:

SELECT s.difficulty_uuid, s.modifiedTimestamp, d.name
FROM exercise_session s
INNER JOIN difficulty d ON s.difficulty_uuid=d.uuid
ORDER BY s.modifiedTimestamp DESC
LIMIT 20

Some data:

INSERT INTO difficulty (uuid, createdTimestamp, modifiedTimestamp, name) VALUES
    (0x00000000000000000000000000000000, NULL, NULL, 'difficulty');

INSERT INTO exercise_session (uuid, createdTimestamp, modifiedTimestamp, type, status, difficulty_uuid) VALUES
    (0x00000000000000000000000000000000, NULL, '2017-09-09 03:47:42.000000', '0', 0, 0x00000000000000000000000000000000),
    (0x00000000000000000000000000000001, NULL, '2017-09-09 03:47:42.000000', '0', 0, 0x00000000000000000000000000000000);

When I run my query prefixed with EXPLAIN this is the output:

+------+-------------+-------+------+-----------------+-----------------+---------+--------------+------+---------------------------------+
| id   | select_type | table | type | possible_keys   | key             | key_len | ref          | rows | Extra                           |
+------+-------------+-------+------+-----------------+-----------------+---------+--------------+------+---------------------------------+
|    1 | SIMPLE      | d     | ALL  | PRIMARY         | NULL            | NULL    | NULL         |    1 | Using temporary; Using filesort |
|    1 | SIMPLE      | s     | ref  | difficulty_uuid | difficulty_uuid | 17      | dbname.d.uuid |    1 |                                 |
+------+-------------+-------+------+-----------------+-----------------+---------+--------------+------+---------------------------------+

Why is MySQL / MariaDB doing a filesort instead of using the composite index?

Upvotes: 0

Views: 401

Answers (3)

John Reese
John Reese

Reputation: 573

We fixed the issue by not joining the difficultyies, but instead doing one query using "WHERE IN" to load all of them afterwards.

The proposed fix suggested by @Bill Karwin does not work since we are not using MySQL 8.0 (especially not a preview). Under MariaDB 10.2.8 it did not work.

Rick James' suggestion does not work either, since we are loading more than one column from the difficulty table.

Upvotes: 0

Bill Karwin
Bill Karwin

Reputation: 562438

You should probably fill these tables with more test data before making conclusions about the optimizer's treatment of them. Just having one or two rows in each table is probably going to trigger atypical optimizer plans.

I tried filling the tables with a couple of dozen rows, and got this EXPLAIN:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: s
   partitions: NULL
         type: index
possible_keys: difficulty_uuid,difficulty_uuid_2
          key: difficulty_uuid_2
      key_len: 26
          ref: NULL
         rows: 11
     filtered: 100.00
        Extra: Using where; Using index; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: d
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 16
          ref: test.s.difficulty_uuid
         rows: 1
     filtered: 100.00
        Extra: NULL

Still a filesort, but at least it accessed the tables in the right order: s first, then join to d on its primary key, resulting in an eq_ref type access.

Using an index hint that a table-scan is too costly to consider, it uses the index on modifiedTimestamp:

EXPLAIN SELECT s.difficulty_uuid, s.modifiedTimestamp, d.name 
FROM exercise_session s FORCE INDEX(modifiedTimestamp) 
JOIN difficulty d ON s.difficulty_uuid=d.uuid 
ORDER BY s.modifiedTimestamp DESC LIMIT 20

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: s
   partitions: NULL
         type: index
possible_keys: NULL
          key: modifiedTimestamp
      key_len: 9
          ref: NULL
         rows: 11
     filtered: 100.00
        Extra: Using where; Backward index scan
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: d
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 16
          ref: test.s.difficulty_uuid
         rows: 1
     filtered: 100.00
        Extra: NULL

No longer any filesort, but we see Backward index scan which is new in MySQL 8.0 (I'm testing with the preview build).

The blog introducing this feature http://mysqlserverteam.com/mysql-8-0-labs-descending-indexes-in-mysql/ mentions that MySQL 5.7 can scan an index backwards, but it's about 15% more costly than doing a forward index scan. Jeremey Cole has done blogs and presentations on InnoDB index internals, and I remember he showed exactly why descending index scans are more costly, but I can't recall it clearly enough to explain here.

Upvotes: 2

Rick James
Rick James

Reputation: 142306

See what the EXPLAIN for this is like, especially with more than one row:

SELECT  s.difficulty_uuid, s.modifiedTimestamp, 
        (
            SELECT  name
                FROM  difficulty
                WHERE  uuid = s.difficulty_uuid 
        ) AS name
    FROM  exercise_session s
    ORDER BY  s.modifiedTimestamp DESC
    LIMIT  20 

BTW, UUIDs are awful as keys when you have huge tables. The are not really needed except in a few 'distributed' architecture situation. Consider the user of AUTO_INCREMENT instead.

Upvotes: 1

Related Questions