Reputation: 249
I'm using a MySQL database to store values from some energy measurement system. The problem is that the DB contains millions of rows, and the queries take somewhat long to complete. Are the queries optimal? What should I do to improve them?
The database table consists of rows with 15 columns each (t, UL1, UL2, UL3, PL1, PL2, PL3, P, Q1, Q2, Q3,CosPhi1, CosPhi2, CosPhi3, i), where t is time, P is total power and i is some identifier.
Seeing as I display the data in graphs grouped in different intervals (15 minutes, 1 hour, 1 day, 1 month) I want to group the querys as such.
As an example I have a graph that shows the kWh for every day in the current year. The query to gather the data goes like this:
SELECT t, SUM(P) as P
FROM table
WHERE i = 0 and t >= '2015-01-01 00:00:00'
GROUP BY DAY(t), MONTH(t)
ORDER BY t
The database has been gathering measurements for 13 days, and this query alone is already taking 2-3 seconds to complete. Those 13 days have added about 1-1.3 million rows to the db, as a new row gets added every second.
Is this query optimal?
Upvotes: 0
Views: 53
Reputation: 1270421
For this query:
SELECT t, SUM(P) as P
FROM table
WHERE i = 0 and t >= '2015-01-01 00:00:00'
GROUP BY DAY(t), MONTH(t)
ORDER BY t
The optimal index is a covering index: table(i, t, p)
.
2-3 seconds for 1+ million rows suggests that you already have an index.
You may want to consider DRapp's suggestion and use summary tables. In a few months, you will have so much data that historical queries could be taking a long time.
In the meantime, though, indexes and partitioning might provide sufficient performance for your needs.
Upvotes: 1
Reputation: 48159
I would actually create a secondary table that has a column for each DAY, and one for the total. Then, via a trigger, your insert into the detail table can update the secondary aggregate table. This way, you can sum the DAILY table which will be much quicker, and yet still have the per second table if you needed to look at the granular level details.
Having aggregate tables can be a common time-saver for querying, especially for read-only types of data, or data you know wont be changing. Then, if you want more granular detail such as hourly or 15 minute intervals, go directly to the raw data.
Upvotes: 2