Reputation: 1571
I have a table which contains nearly 1 million+ records. I want to find the max record of each group. Here is my sql:
SELECT *
FROM t
WHERE id IN (SELECT max(id) AS id
FROM t
WHERE a = 'some' AND b = 0
GROUP BY c, d);
Table declares as follow.
CREATE TABLE `t` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'id',
`a` varchar(32) NOT NULL COMMENT 'a',
`b` tinyint(3) unsigned NOT NULL COMMENT 'b',
`c` bigint(20) unsigned NOT NULL COMMENT 'c',
`d` varchar(32) NOT NULL COMMENT 'd',
PRIMARY KEY (`id`),
KEY `idx_c_d` (`c`,`d`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='test table';
I have a union index on c and d. So the second statement(SELECT max(id) AS id FROM t WHERE a = 'some' AND b = 0 GROUP BY c, d
) execute in 200ms. But the total statement cost nearly 6 seconds(The result contains 5000 rows).
Here is the explain
shows(some columns are omitted).
+-------------+-------+-------+---------------+--------+---------+----------+--------------------------+
| select_type | table | type | possible_keys | key | rows | filtered | Extra |
+-------------+-------+-------+---------------+--------+---------+----------+--------------------------+
| PRIMARY | t | ALL | NULL | NULL | 9926024 | 100.00 | Using where |
| SUBQUERY | t | index | idx_1 | idex_1 | 9926024 | 1.00 | Using where; Using index |
+-------------+-------+-------+---------------+--------+---------+----------+--------------------------+
Upvotes: 1
Views: 769
Reputation: 1269643
I recommend using a correlated subquery:
SELECT t.*
FROM t
WHERE t.id IN (SELECT MAX(t2.id)
FROM t t2
WHERE t2.c = t.c AND t2.d = t.d AND
t2.a = 'some' AND t2.b = 0
);
This assumes that id
is unique in the table.
For performance, you want an index on (c, d, a, b, id)
.
Upvotes: 0
Reputation: 21513
Avoiding the need for a sub query
SELECT t1.*
FROM t t1
LEFT OUTER JOIN t t2
ON t1.c = t2.c
AND t1.d = t2.d
AND t1.id < t2.id
AND t2.id IS NULL
AND t2.a = 'some'
AND t2.b = 0
Upvotes: 0
Reputation: 48139
All different ways to "skin-a-cat", but here's slightly different... Since you are looking for IN, I would move that query into the front position. Also, it MAY help using MySQL's language specific keyword "STRAIGHT_JOIN" telling MySQL to do in the order you have listed. Again it MAY help
SELECT
T.*
FROM
(SELECT max(id) AS id
FROM t
WHERE b = 0
AND a = 'some'
GROUP BY c, d) PQ
JOIN T
on PQ.ID = T.ID
I would also have index specifically in order of
(b, a, c, d, id )
Obviously keep the primary ID key, and if using STRAIGHT_JOIN, would be
SELECT STRAIGHT_JOIN
T.* ( ... rest of query)
Upvotes: 1
Reputation: 32003
you can try by using corelated subquery and creating index in column c and d
SELECT t1.* FROM table_name t1
WHERE id = (SELECT max(id) AS id FROM table_name t2 where
t1.c=t2.c and t1.d=t2.d
) and t1.a = 'some' AND t1.b = 0
Upvotes: 0