RTF
RTF

Reputation: 6494

MySQL index query taking long time for specific column value

I have 2 MySQL (Ver 14.14 Distrib 5.5.49) tables that look something like this:

CREATE TABLE `Document` (
    `Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `CompanyCode` int(10) unsigned NOT NULL,
    `B` int(10) unsigned NOT NULL,
    `C` int(10) unsigned NOT NULL,
    `DocumentCode` int(10) unsigned NOT NULL,
    `E` int(11) DEFAULT '0',
    `EpochSeconds` int(11) DEFAULT '0',
    `G` int(10) unsigned NOT NULL,
    `H` int(10) unsigned NOT NULL,
    `I` int(11) DEFAULT '0',
    `J` int(11) DEFAULT '0',
    `K` varchar(48) DEFAULT '',
  PRIMARY KEY (`Id`),
    KEY `Idx1` (`CompanyCode`),
    KEY `Idx2` (`B`,`C`),
    KEY `Idx3` (`CompanyCode`,`DocumentCode`),
    KEY `Idx4` (`CompanyCode`,`B`,`C`),
    KEY `Idx5` (`H`),
    KEY `Idx6` (`CompanyCode`,`K`),
    KEY `Idx7` (`K`),
    KEY `Idx8` (`K`,`E`),
    KEY `NEWIDX` (`DocumentCode`,`EpochSeconds`),
) ENGINE=MyISAM AUTO_INCREMENT=397783215 DEFAULT CHARSET=latin1

CREATE TABLE `Company` (
    `Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `CompanyCode` int(10) unsigned NOT NULL,
    `CompanyName` varchar(150) NOT NULL,
    `C` varchar(2) NOT NULL,
    `D` varchar(10) NOT NULL,
    `E` varchar(150) NOT NULL,
  PRIMARY KEY (`Id`),
    KEY `Idx1` (`CompanyCode`),
    KEY `Idx2` (`CompanyName`),
    KEY `Idx3` (`C`),
    KEY `Idx4` (`D`,`C`)
    KEY `Idx5` (`E`)
) ENGINE=MyISAM AUTO_INCREMENT=9218804 DEFAULT CHARSET=latin1

I've omitted most of the column definitions from Company because I don't want to complicate the question unnecessarily, but those missing columns are not involved in any KEY definitions.

Document has ~12.5 million rows and Company has ~600,000 rows.
I've added the KEY NEWIDX to Document to facilitate the following query:

SELECT Document.*, Company.CompanyName FROM Document, Company where Document.DocumentCode = ? and Document.CompanyCode = Company.CompanyCode ORDER BY Document.EpochSeconds desc LIMIT 0, 30;

Execution Plan:

+----+-------------+--------------+------+-----------------------------------+-------------+---------+------------------------------+--------+---------------------------------+
| id | select_type | table        | type | possible_keys                     | key         | key_len | ref                          | rows   | Extra                           |
+----+-------------+-------+------+------------------------------------------+-------------+---------+------------------------------+--------+---------------------------------+
|  1 | SIMPLE      | Company      | ALL  | Idx1                              | NULL        | NULL    | NULL                         | 593729 | Using temporary; Using filesort |
|  1 | SIMPLE      | Document     | ref  | Idx1,Idx4,Idx6,NEWIDX,Idx3        | Idx3        | 8       | db.Company.CompanyCode,const |      3 |                                 |
+----+-------------+-------+------+-----------------------------------------------------------+-------------+---------+----------------------+--------+------------------------+

If the value for Document.DocumentCode above is anything other than 8, then the query returns instantly (0.00 secs). If the value is 8, then the query takes anywhere between 38 and 45 seconds. If I remove Company from the query e.g.

SELECT * FROM Document where DocumentCode = 8 ORDER BY EpochSeconds desc LIMIT 0, 30;

Execution plan:

+----+-------------+-----------+------+---------------+------------+---------+-------+---------+-------------+
| id | select_type | table     | type | possible_keys | key        | key_len | ref   | rows    | Extra       |
+----+-------------+-----------+------+---------------+------------+---------+-------+---------+-------------+
|  1 | SIMPLE      | Documents | ref  | NEWIDX        | NEWIDX     | 4       | const | 3654177 | Using where |
+----+-------------+-----------+------+---------------+------------+---------+-------+---------+-------------+

...then the query returns instantly (0.00 secs).

I've also run the mysqlcheck utility on the Document table and it doesn't report any problems.

Why might the query where the DocumentCode = 8 be taking so long when the Company join is used in the query, whereas any other value for DocumentCode returns so quickly?


Here's a comparison of the execution plans for DocumentCode = 8:

+----+-------------+--------------+------+-----------------------------------+-------------+---------+------------------------------+--------+---------------------------------+
| id | select_type | table        | type | possible_keys                     | key         | key_len | ref                          | rows   | Extra                           |
+----+-------------+-------+------+------------------------------------------+-------------+---------+------------------------------+--------+---------------------------------+
|  1 | SIMPLE      | Company      | ALL  | Idx1                              | NULL        | NULL    | NULL                         | 593729 | Using temporary; Using filesort |
|  1 | SIMPLE      | Document     | ref  | Idx1,Idx4,Idx6,NEWIDX,Idx3        | Idx3        | 8       | db.Company.CompanyCode,const |      3 |                                 |
+----+-------------+-------+------+-----------------------------------------------------------+-------------+---------+----------------------+--------+------------------------+

and DocumentCode = 9:

+----+-------------+----------+------+----------------------------+--------+---------+--------------------------+---------+-------------+
| id | select_type | table    | type | possible_keys              | key    | key_len | ref                      | rows    | Extra       |
+----+-------------+----------+------+----------------------------+--------+---------+--------------------------+---------+-------------+
|  1 | SIMPLE      | Document | ref  | Idx1,Idx4,Idx6,NEWIDX,Idx3 | NEWIDX | 4       | const                    | 1953090 | Using where |
|  1 | SIMPLE      | Company  | ref  | Idx1                       | Idx1   | 4       | db.Document.CompanyCode  |       1 |             |
+----+-------------+----------+------+----------------------------+--------+---------+--------------------------+---------+-------------+

They're obviously different, but I don't understand them enough to interpret what's happening. Also, performing ANALYZE TABLE Document and ANALYZE TABLE Company both report OK.

Upvotes: 1

Views: 362

Answers (2)

Kickstart
Kickstart

Reputation: 21513

Using a STRAIGHT_JOIN to force the order that MySQL does the join in

SELECT Document.*, 
Company.CompanyName 
FROM Document
STRAIGHT_JOIN Company 
ON Document.CompanyCode = Company.CompanyCode
WHERE Document.DocumentCode = ? 
ORDER BY Document.EpochSeconds DESC
LIMIT 0, 30;

Upvotes: 1

Seb
Seb

Reputation: 1551

The reason for this behaviour lies in the way mysql optimizes your query - or at least tries to. You see that in the explained queries. Mysql changes the table it uses as a basis for the query. With documentCode = 8 it's based on company, with documentCode=9 it's based on document. Mysql thinks, that for documentCode=8 it will be faster, if it's not using the index but instead using the other table as basis. Why exactly I do not know.

I would adivse you to use an explizit join, to tell mysql wich tables to use in wich order:

SELECT Document.*, Company.CompanyName 
FROM Document 
JOIN Company ON Document.CompanyCode = Company.CompanyCode 
WHERE Document.DocumentCode = ?
ORDER BY Document.EpochSeconds desc LIMIT 0, 30;

Mysql even supports telling it, what index it should use:

SELECT Document.*, Company.CompanyName 
FROM Document 
JOIN Company USE INDEX Idx1 ON Document.CompanyCode = Company.CompanyCode 
WHERE Document.DocumentCode = ?
ORDER BY Document.EpochSeconds desc LIMIT 0, 30;

Instead of USE INDEX you can also try FORCE INDEX. That's stronger. But I guess it will by default use Idx1.

But be advised that your new index NEWIDX will not be used for this query, because it needs to join first and filter the result set - which has no index. So this ORDER BY on the results is a very expensive operation.

Upvotes: 1

Related Questions