Reputation: 17010
I'm losing hair on a stupid query. First, I would explain what's its goal. I have a set of values fetched every hour and stored in the DB. These values can increase or stay equal with time. This query extracts the latest value day by day for latest 60 days (I have twins query for extract lastest value by weeks and months, they are similar). The query is self explanatory:
SELECT l.value AS value
FROM atable AS l
WHERE l.time = (
SELECT MAX(m.time)
FROM atable AS m
WHERE DATE(l.time) = DATE(m.time)
LIMIT 1
)
ORDER BY l.time DESC
LIMIT 60
It looks no special. But it's extremely slow (> 30 secs), considering time
is an index and table contains less than 5000 rows. And I'm sure the problem is with sub-query.
Where is the noob mistake?
Update 1: Same situation if I avoid MAX()
using SELECT m.time ... ORDER BY m.time DESC
.
Update 2: Seems is not a problem with DATE()
function called to many times. I've tried to create a calculated field day DATE
. The UPDATE atable SET day = DATE(time)
runs in less than 2secs. The modified query, with l.day = m.day
(no functions!), runs in the same exactly time as before.
Upvotes: 3
Views: 682
Reputation: 48149
Based on the feedback answer, if the entries are sequentially added via date/time, directly correlated to the auto-increment ID, who cares about the TIME... get the auto-inc number for exact, non-ambiguous join
select
A1.AutoID,
A1.time,
A1.Value
from
( select date( A2.time ) as SingleDate,
max( A2.AutoID ) as MaxAutoID
from aTable A2
where date( A2.Time ) >= date( date_sub( now(), interval 60 day ))
group by date( A2.time ) ) into MaxPerDate
JOIN aTable A1
on MaxPerDate.MaxAutoID = A1.AutoID
order by
A1.AutoID DESC
Upvotes: 1
Reputation: 53870
The main issue I see is using DATE()
on the left of the expression in the WHERE
clause. Using the function DATE()
on both sides of the WHERE
expression explicitly prevents MySQL from using an index on the date field. Instead, it must scan all rows to apply the function on each row.
Instead of this:
WHERE DATE(l.time) = DATE(m.time)
Try something like this:
WHERE l.time BETWEEN
DATE_SUB(m.date, INTERVAL TIME_TO_SEC(m.date) SECOND)
AND DATE_ADD(DATE_SUB(m.date, INTERVAL TIME_TO_SEC(m.date) SECOND), INTERVAL 86399 SECOND)
Maybe you know of a better way to turn m.date
into a range like 2012-02-09 00:00:00
and 2012-02-09 23:59:59
than the above example, but the idea is that you want to keep the left side of the expression as the raw column name, l.time
in this case, and give it a range in the form of two constants (or two expressions that can be converted to constants) on the right side.
EDIT
I'm using your pre-calculated day
field:
SELECT *
FROM atable a
WHERE a.time IN
(SELECT MAX(time)
FROM atable
GROUP BY day
ORDER BY day DESC
LIMIT 60)
At least here, the inner query is only ran once, and then a binary search is done with the IN
cluase. You're still scanning the table, but just once, and the advantage of the inner query being run just once will probably make a huge dent.
If you know that you have values for every day, you could improve that inner query by adding a WHERE
clause, limiting it to the last 60 calendar days, and losing the LIMIT 60
. Make sure that day
and time
are indexed.
Upvotes: 2
Reputation: 5062
Your outer query is using a filesort without indexes. Try changing to InnoDB engine to see if it improves things.
Doing a quick test:
mysql> show create table atable\G
*************************** 1. row ***************************
Table: atable
Create Table: CREATE TABLE `atable` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`t` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `t` (`t`)
) ENGINE=InnoDB AUTO_INCREMENT=51 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql> explain SELECT id FROM atable AS l WHERE l.t = ( SELECT MAX(m.t) FROM atable AS m WHERE DATE(l.t) = DATE(m.t) LIMIT 1 ) ORDER BY l.t DESC LIMIT 50;
+----+--------------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| 1 | PRIMARY | l | index | NULL | t | 4 | NULL | 50 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | m | index | NULL | t | 4 | NULL | 50 | Using where; Using index |
+----+--------------------+-------+-------+---------------+------+---------+------+------+--------------------------+
2 rows in set (0.00 sec)
After changing to MyISAM:
mysql> explain SELECT id FROM atable AS l WHERE l.t = ( SELECT MAX(m.t) FROM atable AS m WHERE DATE(l.t) = DATE(m.t) LIMIT 1 ) ORDER BY l.t DESC LIMIT 50;
+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------+
| 1 | PRIMARY | l | ALL | NULL | NULL | NULL | NULL | 50 | Using where; Using filesort |
| 2 | DEPENDENT SUBQUERY | m | index | NULL | t | 4 | NULL | 50 | Using where; Using index |
+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------+
2 rows in set (0.00 sec)
Upvotes: 0
Reputation: 1832
If you have an index on time
, I would suggest getting TOP 1
instead of MAX
as follows:
SELECT l.value AS value
FROM table AS l
WHERE l.time = (
SELECT TOP 1 m.time
FROM table AS m
ORDER BY m.time DESC LIMIT 1
)
ORDER BY l.time DESC LIMIT 60
Upvotes: 0
Reputation: 2920
You could use the "explain" statement to get mysql to tell you what it's doing.
EXPLAIN SELECT l.value AS value
FROM table AS l
WHERE l.time = (
SELECT MAX(m.time)
FROM table AS m
WHERE DATE(l.time) = DATE(m.time) LIMIT 1
)
ORDER BY l.time DESC LIMIT 60
That should at least give you an insight where to look further.
Upvotes: 0
Reputation: 2828
Instead of using MAX(m.time) do the following in the sub-select
SELECT m.time
FROM table AS m
WHERE DATE(l.time) = DATE(m.time)
ORDER BY m.time DESC
LIMIT 1
This might help speed up the query since it is giving the query parser an alternative
However one other piece i noticed is you are using the DATE(l.time) and DATE(m.time) which if your index is not created on DATE(m.time) then you will not be using the index and hence could cause slowness.
Upvotes: 1