Devbreaker
Devbreaker

Reputation: 19

How can I optimize mysql's date group by speed?

CREATE TABLE `device_m1000` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`sensor_id` MEDIUMINT(9) NOT NULL,
`ctime` DATETIME NOT NULL,
`now_data` FLOAT(12) NOT NULL,
`total_data` FLOAT(12) NOT NULL,
PRIMARY KEY (`id`) USING BTREE,
INDEX `sensor_id` (`sensor_id`) USING BTREE,
INDEX `ctime` (`ctime`) USING BTREE) COLLATE='utf8_unicode_ci' ENGINE=InnoDB AUTO_INCREMENT=296533;

enter image description here

There are 300,000 data in the table above. The total_data column always comes in accumulated usage. (A structure that always increases) If I want to know the usage of "today", I take min value and max value based on today's date and subtract it.

try sql

SELECT ROUND((max.v - min.v), 2) total
FROM   (SELECT
               DATE(ctime) `date`,
               sum(total_data)  v
        FROM   device_m1000
        WHERE  ( sensor_id, ctime ) IN (SELECT sensor_id,
                                               MAX(ctime) AS dt
                                        FROM   device_m1000
                                        WHERE  ctime >= '2020-11-23 06:30' and ctime < '2020-11-24 06:30'
                                        GROUP  BY sensor_id
                                        ORDER  BY null)
        group by `date`) max
       INNER JOIN (SELECT
                          DATE(ctime) `date`,
                          sum(total_data)  v
                   FROM   device_m1000
                   WHERE  ( sensor_id, ctime ) IN (SELECT sensor_id,
                                                          MIN(ctime) AS dt
                                                   FROM   device_m1000
                                                   WHERE  ctime >= '2020-11-23 06:30' and ctime < '2020-11-24 06:30'
                                                   GROUP  BY sensor_id
                                                   ORDER  BY null)
                   group by `date`) min
               ON min.`date` = max.`date`;

Use the following query to get the data. However, it simply delays more than 10 seconds in the code at max(ctime). How can I optimize it?

Upvotes: 0

Views: 61

Answers (2)

Rick James
Rick James

Reputation: 142298

  • "Row constructors" perform poorly. (WHERE (a,b) IN ...)
  • FLOAT(12) does not give you 12 significant digits, you get only about 7.
  • Is id useful? Or is (sensor_id, ctime) unique? If it is unique, make it the PRIMARY KEY and get rid of id.
  • If you can't make it the PK, have INDEX(sensor_id, ctime).

Back to your question. Abandon your attempt to put the subtotal in each row. Virtually any approach will be performaing over a billion operations per second -- because of having to sum up about 30K items from the day for each sensor.

Instead make a separate "summary table" with date, sensor, and total for the day. Then, when you INSERT into the detail table (like what you have), also do

INSERT INTO daily_summary (date, sensor, tot)
    VALUES (?, ?, ?)
    ON DUPLICATE KEY UPDATE
        tot = VALUES(tot) + ?

And use DOUBLE for tot. And have PRIMARY KEY(sensor, date)

This avoids your subtract and provides a lot of efficiency.

More on summary tables: http://mysql.rjweb.org/doc.php/summarytables

If you need hourly subtotals, then make it an hourly table instead of daily. You can efficiently get the daily amount from adding the 24 rows from the hourly summary table.

Upvotes: 0

GMB
GMB

Reputation: 222442

I understand that you want the different between the latest and earliest total_data of today for each sensor. If so, you can use window functions like this:

select sensor_id, sum(case when rn_desc = 1 then total_data else - total_data end) as total_diff
from (
    select d.*, 
        row_number() over(partition by sensor_id order by ctime) rn_asc,
        row_number() over(partition by sensor_id order by ctime desc) rn_desc
    from device_m1000
    where ctime >= current_date and ctime < current_date + interval 1 day
) t
where 1 in (rn_asc, rn_desc)
group by sensor_id

Actually if total_data is always increasing, this is simpler (and works in all MySQL versions):

select sensor_id, max(total_data) - min(total_data) as total_diff
from device_m1000
where ctime >= current_date and ctime < current_date + interval 1 day
group by sensor_id

Then I would recommend the following index: (ctime, sensor_id, total_data).

Upvotes: 1

Related Questions