Reputation: 23
I have the following MySQL (MyISAM) table with about 3 Million rows.
CREATE TABLE `tasks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`node` smallint(6) NOT NULL,
`pid` int(11) NOT NULL,
`job` int(11) NOT NULL,
`a_id` int(11) DEFAULT NULL,
`user_id` int(11) NOT NULL,
`state` int(11) NOT NULL,
`start_time` int(11) NOT NULL,
`end_time` int(11) NOT NULL,
`stop_time` int(11) NOT NULL,
`end_stream` int(11) NOT NULL,
`message` varchar(255) DEFAULT NULL,
`rate` float NOT NULL,
`exiting` int(11) NOT NULL DEFAULT '0',
`bytes` int(11) NOT NULL,
`motion` tinyint(4) NOT NULL,
PRIMARY KEY (`id`),
KEY `a_id` (`a_id`),
KEY `job` (`job`),
KEY `state` (`state`),
KEY `end_time` (`end_time`),
KEY `start_time` (`start_time`),
) ENGINE=MyISAM AUTO_INCREMENT=100 DEFAULT CHARSET=utf8;
Now when I run the following query, MySQL is only using the a_id index and needs to scan a few thousand rows.
SELECT count(id) AS tries FROM `tasks` WHERE ( job='1' OR job='3' )
AND a_id='614' AND state >'80' AND state < '100' AND start_time >='1386538013';
When I add an additional index KEY newkey
(a_id
,state
,start_time
), MySQL is still trying to use a_id only and not newkey. Only when using the hint / force index in the query, it's been used. Changing the fields in the query around does not help.
Any ideas? I don't necessarily want hints in my statements. The fact that MySQL is not doing this automatically indicates to me that there is an issue with my table, keys or query somewhere. Any help is highly appreciated.
Additional info:
mysql> show index in tasks;
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| tasks | 0 | PRIMARY | 1 | id | A | 3130554 | NULL | NULL | | BTREE | | |
| tasks | 1 | a_id | 1 | a_id | A | 2992 | NULL | NULL | YES | BTREE | | |
| tasks | 1 | job | 1 | job | A | 5 | NULL | NULL | | BTREE | | |
| tasks | 1 | state | 1 | state | A | 9 | NULL | NULL | | BTREE | | |
| tasks | 1 | end_time | 1 | end_time | A | 1565277 | NULL | NULL | | BTREE | | |
| tasks | 1 | newkey | 1 | a_id | A | 2992 | NULL | NULL | YES | BTREE | | |
| tasks | 1 | newkey | 2 | state | A | 8506 | NULL | NULL | | BTREE | | |
| tasks | 1 | newkey | 3 | start_time | A | 3130554 | NULL | NULL | | BTREE | | |
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
EXPLAIN with and without quotes:
mysql> DESCRIBE SELECT count(id) AS tries FROM `tasks` WHERE ( job='1' OR job='3' ) AND a_id='614' AND state >'80' AND state < '100' AND start_time >='1386538013';
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
| 1 | SIMPLE | tasks | ref | a_id,job,state,newkey | a_id | 5 | const | 740 | Using where |
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
1 row in set (0.10 sec)
mysql> DESCRIBE SELECT count(id) AS tries FROM `tasks` WHERE ( job=1 OR job=3 ) AND a_id = 614 AND state > 80 AND state < 100 AND start_time >= 1386538013;
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
| 1 | SIMPLE | tasks | ref | a_id,job,state,newkey | a_id | 5 | const | 740 | Using where |
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
1 row in set (0.01 sec)
Upvotes: 2
Views: 149
Reputation: 48179
A few things... I would have a SINGLE compound index on ( a_id, job, state, start_time )
This to help optimize the query on all the criteria, in what I believe is the best tuned sequence. A single "A_ID", then two jobs, a small state range, then time based. Next, notice no quotes... It appears you were converting numeric to string comparisons, leave them as numeric for compare -- faster than strings.
Also, by having them all as part of the index, it is a COVERING index meaning it does NOT have to go to the raw page data to get the other values to test the qualifying records to include or not.
SELECT
count(*) AS tries
FROM
tasks
WHERE
a_id = 614
AND job IN ( 1, 3 )
AND state > 80 AND state < 100
AND start_time >= 1386538013;
Now, the why the index... consider the following scenario. You have two rooms that have boxes... In the first room, each box is an "a_id", within that are the jobs in order, within each job are the state ranges, and finally by start time.
In another room, your boxes are sorted by start time, within that a_id are sorted, and finally state.
Which would be easier to find what you need. That is how you should think on the indexes. I would rather go to one box for "A_ID = 614", then jump to Job 1 and another for Job 3. Within each Job 1, Job 3, grab 80-100, then time. You however know better your data and volume in each criteria consideration and may adjust.
Finally, the count(ID) vs count(*). All I care about is a record qualified. I don't need to know the actual ID as the filtering criteria already qualified as include or not, why look (in this case) for the actual "ID".
Upvotes: 1
Reputation: 10246
Most of a_id=614
's state has > 80 and < 100, then it could be happened. Have you tried one of below indexes?
INDEX(a_id, start_time, state)
INDEX(start_time, a_id, state)
Upvotes: 0
Reputation: 13725
Probably mysql thinks that using the a_id key will using less IO. Probably the cardinality of the key a_id is good enough. What explains of the hinted/hintless queries say?
Upvotes: 0