Reputation: 21
I have this query
SELECT
DC.variationID, COUNT(DISTINCT(DC.userID)) AS conversion
FROM
XXXX AS DC
WHERE
DC.testID = 'XXXX' AND DC.visit > 1
GROUP BY
DC.variationID
This is the description of the table
CREATE TABLE `XXXX`
(
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`userID` bigint(17) NOT NULL,
`testID` bigint(20) NOT NULL,
`variationID` bigint(20) NOT NULL,
`url` bigint(20) NOT NULL,
`time` bigint(20) NOT NULL,
`visit` bigint(20) NOT NULL DEFAULT '1',
`isTestPage` tinyint(1) NOT NULL,
PRIMARY KEY (`id`,`testID`),
KEY `url` (`url`),
KEY `dc3_testIDPage` (`testID`,`url`),
KEY `testid_istest` (`testID`,`isTestPage`),
KEY `dc3_varIDPage` (`variationID`,`url`),
KEY `index_rebond` (`testID`,`visit`,`variationID`),
KEY `dc3_testIDvarIDPage` (`testID`,`variationID`,`url`),
KEY `isTestPage2` (`variationID`,`isTestPage`,`visit`,`userID`),
KEY `user_test_varID_url` (`userID`,`testID`,`variationID`,`url`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED
This is the explain of the query
# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
'1', 'SIMPLE', 'DC', NULL, 'ref', 'dc3_testIDPage,testid_istest,dc3_varIDPage,index_rebond,dc3_testIDvarIDPage,isTestPage2,user_test_varID_url', 'dc3_testIDvarIDPage', '8', 'const', '13695309', '33.33', 'Using index condition; Using where'
For me the query should use the index 'index_rebond' but unfortunatelly it not use it.I pretty sure that before the query use the index 'index_rebond'.
The query takes a lot of time to complete. Can you please tell me why the query doesn't use the index 'index_rebond' and what is the best way to optimize the query?
This is the result of show index
# Table, Non_unique, Key_name, Seq_in_index, Column_name, Collation, Cardinality, Sub_part, Packed, Null, Index_type, Comment, Index_comment
'datacollect_v3', '0', 'PRIMARY', '1', 'id', 'A', '25909280', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '0', 'PRIMARY', '2', 'testID', 'A', '25909280', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'url', '1', 'url', 'A', '1657369', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'dc3_testIDPage', '1', 'testID', 'A', '2167', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'dc3_testIDPage', '2', 'url', 'A', '1850256', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'testid_istest', '1', 'testID', 'A', '3813', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'testid_istest', '2', 'isTestPage', 'A', '5721', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'dc3_varIDPage', '1', 'variationID', 'A', '2053', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'dc3_varIDPage', '2', 'url', 'A', '4171834', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'index_rebond', '1', 'testID', 'A', '1811', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'index_rebond', '2', 'visit', 'A', '11357', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'index_rebond', '3', 'variationID', 'A', '17208', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'dc3_testIDvarIDPage', '1', 'testID', 'A', '2049', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'dc3_testIDvarIDPage', '2', 'variationID', 'A', '3513', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'dc3_testIDvarIDPage', '3', 'url', 'A', '929052', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'isTestPage2', '1', 'variationID', 'A', '1891', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'isTestPage2', '2', 'isTestPage', 'A', '3309', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'isTestPage2', '3', 'visit', 'A', '16172', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'isTestPage2', '4', 'userID', 'A', '2712038', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'user_test_varID_url', '1', 'userID', 'A', '1103566', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'user_test_varID_url', '2', 'testID', 'A', '1336479', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'user_test_varID_url', '3', 'variationID', 'A', '1325388', NULL, NULL, '', 'BTREE', '', ''
'datacollect_v3', '1', 'user_test_varID_url', '4', 'url', 'A', '16936138', NULL, NULL, '', 'BTREE', '', ''
Best regards,
Upvotes: 0
Views: 46
Reputation: 142278
PRIMARY KEY (id, testID)
does not make sense unless you are partitioning by testID
. Shouldn't you have PRIMARY KEY (id)
?
If you have some combination of columns that is naturally unique, consider getting rid of id
and using that combo as the PK.
This will be "covering", hence probably significantly faster:
INDEX(testID, visit, variationID, userID)
(And get rid of index_rebond
since it will be redundant.) isTestPage2
is also covering (because the PK is included), but it is ordered inefficiently.
Unless you really expect to have more than 4 billion of the various ids, recommend you switch them to INT UNSIGNED
(here and in the other tables). Cutting this table and its indexes in half will help performance, especially if the dataset is too big to be cached.
Upvotes: 1
Reputation: 133360
Could be the query optimizer prefer dc3_testIDPage or testid_istest because they also start with testID
but you can force the use of an index
SELECT DC.variationID, COUNT(DISTINCT(DC.userID)) AS conversion
FROM XXXX AS DC force index_rebond
WHERE DC.testID = 'XXXX' AND DC.visit > 1
GROUP BY DC.variationID
Upvotes: 0