Lelouch Lamperouge
Lelouch Lamperouge

Reputation: 8421

Even with seemingly correct indices and enough memory, query runs too long

I have a table with 3 million rows and 6 columns.

The table structure:

| Sample | CREATE TABLE `sample` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `FileMD5` varchar(32) NOT NULL,
  `NoCsumMD5` varchar(32) NOT NULL,
  `SectMD5` varchar(32) NOT NULL,
  `SectNoResMD5` varchar(32) NOT NULL,
  `ImpMD5` varchar(32) NOT NULL,
  `Overlay` tinyint(1) NOT NULL DEFAULT '1',
  PRIMARY KEY (`ID`),
  KEY `FileMD5` (`FileMD5`),
  KEY `NoCsumMD5` (`NoCsumMD5`)
) ENGINE=InnoDB AUTO_INCREMENT=3073630 DEFAULT CHARSET=latin1 |

The temporary table values:

mysql> SHOW VARIABLES LIKE 'tmp_table_size';
+----------------+----------+
| Variable_name  | Value    |
+----------------+----------+
| tmp_table_size | 16777216 |
+----------------+----------+
1 row in set (0.00 sec)
mysql> SHOW VARIABLES LIKE 'max_heap_table_size';
+---------------------+----------+
| Variable_name       | Value    |
+---------------------+----------+
| max_heap_table_size | 16777216 |
+---------------------+----------+
1 row in set (0.00 sec)

My Query

mysql> explain SELECT NoCsumMD5,Count(FileMD5) 
FROM Sample GROUP BY NoCsumMD5 
HAVING Count(FileMD5) > 10 ORDER BY Count(FileMD5) Desc ;
+----+-------------+--------+-------+---------------+-----------+---------+------+---------+---------------------------------+
| id | select_type | table  | type  | possible_keys | key       | key_len | ref  | rows    | Extra                           |
+----+-------------+--------+-------+---------------+-----------+---------+------+---------+---------------------------------+
|  1 | SIMPLE      | Sample | index | NULL          | NoCsumMD5 | 34      | NULL | 2928042 | Using temporary; Using filesort |
+----+-------------+--------+-------+---------------+-----------+---------+------+---------+---------------------------------+

How can I optimize this query. Even after 10 min, it generates no output.

I feel that I have indexed the right columns and given enough memory for temporary tables.

Upvotes: 0

Views: 54

Answers (2)

newtover
newtover

Reputation: 32094

Since FileMD5 is not NULL in your table definition, the query can be simplified, and you will not need the composite index @brendan-long suggests (NoCsumMD5 index is enough):

SELECT NoCsumMD5, Count(*) as cnt 
FROM Sample
GROUP BY NoCsumMD5 
HAVING cnt > 10
ORDER BY cnt DESC;

Upvotes: 1

Brendan Long
Brendan Long

Reputation: 54312

I'm not sure if this will help, but MySQL can only use one index at a time, so it may be helpful to create an index over both FileMD5 and NoCsumMD5:

KEY `someName` (`NoCsumMD5`, `FileMD5`),

Here's some information on multiple column indexes:

MySQL can use multiple-column indexes for queries that test all the columns in the index, or queries that test just the first column, the first two columns, the first three columns, and so on. If you specify the columns in the right order in the index definition, a single composite index can speed up several kinds of queries on the same table.

The short version is that the order of the columns in the index matters, because MySQL can only use the index in that order (for example, in the index I gave above, it can test NoCsumMD5, then narrow the result down using FileMD5).

I'm not sure how much it will help in this query though, since all you care about is whether FileMD5 is NULL or not..

Upvotes: 1

Related Questions