Why is there a big difference between the results of EXPLAIN and Slow query?

Question

The result of explain on MySQL 5.6:

EXPLAIN SELECT c1, c2, c3, c4, c5
FROM t1
WHERE ((c3 = 1489930231868609 and c4 in (7, 1169) and c2 between '2018-05-29 10:33:35.495' and '2020-05-29 10:33:35.495'))
ORDER BY c2 desc, c1 desc;

| id | select_type | table | type  | possible_keys         | key       | key_len   | ref  | rows | Extra       |
|  1 | SIMPLE      | t1    | range | idx_c3_c2, c2_c4_idx  | idx_c3_c2 |   13      | NULL |   21 | Using where |

WHERE statement is immutable.

The result of slow query:

# Query_time: 6.397257  Lock_time: 1.034133 Rows_sent: 0  Rows_examined: 1914

Why is there a big difference between the results of EXPLAIN rows and Slow Query rows_examined even if WHERE statement is immutable?

UPDATE

show create table:

mysql> show create table t1 \G;
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `c1` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `c2` datetime NOT NULL,
  `c3` bigint(20) unsigned NOT NULL,
  `c4` bigint(20) unsigned NOT NULL,
  `account_item` varchar(255) NOT NULL,
  `title` varchar(255) NOT NULL,
  `content` varchar(255) NOT NULL,
  `note` varchar(255) DEFAULT NULL,
  `insert_date` datetime NOT NULL,
  `update_date` datetime NOT NULL,
  PRIMARY KEY (`c1`,`c2`),
  KEY `idx_c3_c2` (`c3`,`c2`),
  KEY `c2_c4_idx` (`c2`,`c4)
) ENGINE=InnoDB AUTO_INCREMENT=2081930928 DEFAULT CHARSET=utf8
/*!50500 PARTITION BY RANGE  COLUMNS(c2)
(PARTITION part_201501 VALUES LESS THAN ('2015-01-01 00:00:00') ENGINE = InnoDB,
 ...
 PARTITION part_202010 VALUES LESS THAN ('2020-10-01 00:00:00') ENGINE = InnoDB,
 ... */

UPDATE

I EXPLAIN this query on MySQL 5.7 for how many partitions OPTIMIZER check.

id|select_type|table          |partitions                                                                                                                                                                                                                                                     |type |possible_keys                                 |key               |key_len|ref|rows|filtered|Extra                             |
--|-----------|---------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|----------------------------------------------|------------------|-------|---|----|--------|----------------------------------|
 1|SIMPLE     |t1 |part_201806,part_201807,part_201808,part_201809,part_201810,part_201811,part_201812,part_201901,part_201902,part_201903,part_201904,part_201905,part_201906,part_201907,part_201908,part_201909,part_201910,part_201911,part_201912,part_202001,part_202002,par|range|idx_account_number,execute_date_partner_id_idx|idx_account_number|13     |   |1914|    20.0|Using index condition; Using where|

Bill Karwin · Accepted Answer

EXPLAIN can only give an estimate, because it happens before the query is executed.

In other words, it guesses, based on statistics about cardinality of values in the index.

The query log happens after the query is executed (hence the measurements of query time and lock time). It reports an accurate count of how many rows were examined during this query.

The optimizer's guess can be quite wrong, but making it more accurate would require storing a lot more information about the indexes, and make the query optimization take more time.

And usually, the difference isn't important, unless it leads the optimizer to choose the wrong index. In other words, if the optimizer chooses the best index even though its estimated row count was not perfect, what's the problem?

Why is there a big difference between the results of EXPLAIN and Slow query?

Answers (2)

Related Questions