Reputation: 33
I'm looking for some suggestion or optimization. Table definition:
CREATE TABLE IF NOT EXISTS MilestonesAndFlags(
id SERIAL,
site_id BIGINT,
milestone BIGINT,
value BIGINT,
TIMESTAMP BIGINT,
timestamp_confirmation BIGINT,
COMMENT TEXT,
INDEX(site_id),
INDEX(milestone),
INDEX(milestone,site_id)
);
In this table I store different milestones with timestamp (to be able to make historical view of any changes) per different sites. Table has about million rows at that time.
The problem occures when I try to get latest actual milestone value per sites using queries like
SELECT site_id,
value
FROM SitesMilestonesAndFlags
WHERE id IN
(SELECT max(id)
FROM SitesMilestonesAndFlags
WHERE milestone=1
GROUP BY milestone,
site_id);
This request execution time is higher 5 minutes on my PC..
EXPLAIN
seems to be OK:
+----+--------------------+--------------------+------+-----------------------+-------------+---------+-------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------------+------+-----------------------+-------------+---------+-------+--------+--------------------------+
| 1 | PRIMARY | MilestonesAndFlags | ALL | NULL | NULL | NULL | NULL | 1111320| Using where |
| 2 | DEPENDENT SUBQUERY | MilestonesAndFlags | ref | milestone,milestone_2 | milestone_2 | 9 | const | 180660| Using where; Using index |
+----+--------------------+--------------------+------+-----------------------+-------------+---------+-------+--------+--------------------------+
Any suggestion about more correct query or table structure? MySQL >= 5.5
Upvotes: 3
Views: 87
Reputation: 26004
I'll take a shot and propose that you use a temporary aliased table
instead of the where statement that is a dependent subquery
. Not sure if mysql optimized or runs the subquery for every row of the main/outer query.
It would be very interesting if you ran the queries on large data sizes and come back with your results.
SELECT *
FROM MilestonesAndFlags AS MF,
(SELECT max(id) AS id
FROM MilestonesAndFlags
WHERE milestone=1
GROUP BY milestone,
site_id) AS MaxMF
WHERE MaxMF.id = MF.id;
SQLFiddle: http://sqlfiddle.com/#!2/a0d628/10
Pro: Avoidance of dependent subquery.
Cons:
I suspect also that the version of mysql plays a major role in the optimizations done. Below the explain results for 2 different mysql versions where one defined the subquery as dependent and the other as not.
MySQL 5.5.32
ID SELECT_TYPE TABLE TYPE POSSIBLE_KEYS KEY KEY_LEN REF ROWS EXTRA
1 PRIMARY MilestonesAndFlags ALL (null) (null) (null) (null) 29 Using where; Using filesort
2 DEPENDENT SUBQUERY MilestonesAndFlags ref milestone,milestone_2 milestone_2 9 const 15 Using where; Using index
http://sqlfiddle.com/#!2/a0d628/11
MySQL MySQL 5.6.6 m9
ID SELECT_TYPE TABLE TYPE POSSIBLE_KEYS KEY KEY_LEN REF ROWS EXTRA
1 PRIMARY MilestonesAndFlags ALL (null) (null) (null) (null) 29 Using where; Using filesort
2 SUBQUERY MilestonesAndFlags ref milestone,milestone_2 milestone_2 9 const 15 Using where; Using index
http://sqlfiddle.com/#!9/a0d62/2
Upvotes: 1