Reputation: 13878
Using volkszaehler.org, I need to retrieve data from a table with over a million rows; below is what the ORM creates:
CREATE TABLE `data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`channel_id` int(11) DEFAULT NULL,
`timestamp` bigint(20) NOT NULL,
`value` double NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `ts_uniq` (`channel_id`,`timestamp`),
KEY `IDX_ADF3F36372F5A1AA` (`channel_id`)
)
Now, selecting grouped data is slow, especially when run on low-performance platforms like the Raspberry Pi:
SELECT MAX(timestamp) AS timestamp, SUM(value) AS value, COUNT(timestamp) AS count
FROM data WHERE channel_id = 4 AND timestamp >= 1356994800000 AND timestamp <= 1375009341000
GROUP BY YEAR(FROM_UNIXTIME(timestamp/1000)), DAYOFYEAR(FROM_UNIXTIME(timestamp/1000));
Explain:
select_type: SIMPLE
table: data
type: ref
possible_keys: ts_uniq,IDX_ADF3F36372F5A1AA
key: ts_uniq
key_len: 5
ref: const
rows: 2066
Extra: Using where; Using temporary; Using filesort
The query has to go through about 50k records; it takes 1.5 s on a Core i5 and already 6 s on a Raspberry Pi.
Is there anything I can do to improve performance, apart from reducing the amount of data?
Upvotes: 2
Views: 613
Reputation: 4511
Increasing the amount of data, not decreasing it, is what you need: you have two function calls in the GROUP BY clause, and if you calculate YEAR(FROM_UNIXTIME(timestamp/1000))
and DAYOFYEAR(FROM_UNIXTIME(timestamp/1000))
beforehand in a trigger and store the values in additional columns, your SELECT statement will be much faster.
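A minimal sketch of that approach, assuming MySQL; the ts_year/ts_dayofyear columns and the trigger name are illustrative, not part of the volkszaehler.org schema:

ALTER TABLE data
  ADD COLUMN ts_year SMALLINT NOT NULL DEFAULT 0,
  ADD COLUMN ts_dayofyear SMALLINT NOT NULL DEFAULT 0;

DELIMITER //
-- precompute the GROUP BY expressions once, at insert time
-- (column and trigger names are illustrative)
CREATE TRIGGER data_before_insert BEFORE INSERT ON data
FOR EACH ROW
BEGIN
  SET NEW.ts_year = YEAR(FROM_UNIXTIME(NEW.timestamp / 1000));
  SET NEW.ts_dayofyear = DAYOFYEAR(FROM_UNIXTIME(NEW.timestamp / 1000));
END//
DELIMITER ;

-- backfill existing rows once
UPDATE data SET
  ts_year = YEAR(FROM_UNIXTIME(timestamp / 1000)),
  ts_dayofyear = DAYOFYEAR(FROM_UNIXTIME(timestamp / 1000));

The SELECT can then use GROUP BY ts_year, ts_dayofyear instead of the two function calls (a similar BEFORE UPDATE trigger would keep the columns in sync if rows are ever updated).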
Other than that, you can simply truncate timestamp
to the nearest day by integer-dividing it by 1000*3600*24 = 86400000 (MySQL's DIV operator) and group by that single expression, because I can't see a point in grouping by year and day of year separately when you can group by the day alone:
SELECT
MAX(timestamp) AS timestamp,
SUM(value) AS value,
COUNT(timestamp) AS count
FROM data WHERE
channel_id = 4 AND
timestamp >= 1356994800000 AND
timestamp <= 1375009341000
GROUP BY timestamp DIV 86400000;
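If you need the bucket back in readable form, a variant like this should work (a sketch; under strict ONLY_FULL_GROUP_BY you may have to wrap the day expression in MIN()):

SELECT
-- start of the UTC day bucket, rendered in the session time zone
FROM_UNIXTIME((timestamp DIV 86400000) * 86400) AS day,
MAX(timestamp) AS timestamp,
SUM(value) AS value,
COUNT(timestamp) AS count
FROM data WHERE
channel_id = 4 AND
timestamp >= 1356994800000 AND
timestamp <= 1375009341000
GROUP BY timestamp DIV 86400000;

Keep in mind that DIV buckets by UTC days, while DAYOFYEAR(FROM_UNIXTIME(...)) follows the session time zone, so the groups can differ near midnight.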
Personally, after that I would add a date column, index it, and update it in the trigger, so that all arithmetic expressions can be removed from the GROUP BY. In that case the index can be used.
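A sketch of that variant; the ts_date column and idx_channel_date index are again illustrative names, not part of the volkszaehler.org schema:

ALTER TABLE data ADD COLUMN ts_date DATE NOT NULL DEFAULT '1970-01-01';
CREATE INDEX idx_channel_date ON data (channel_id, ts_date);

-- in the BEFORE INSERT trigger (illustrative):
-- SET NEW.ts_date = DATE(FROM_UNIXTIME(NEW.timestamp / 1000));

SELECT MAX(timestamp) AS timestamp, SUM(value) AS value, COUNT(timestamp) AS count
FROM data WHERE
channel_id = 4 AND
timestamp >= 1356994800000 AND
timestamp <= 1375009341000
GROUP BY ts_date;

With (channel_id, ts_date) indexed, rows for channel_id = 4 come out of the index already ordered by ts_date, so the GROUP BY should no longer need a temporary table and filesort.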
Upvotes: 1