Reputation: 2332
I have a few million records in the MySQL table TEST.
One of the columns (TRIAL_TIME) of the TEST table stores an epoch time in milliseconds as a BIGINT. A SQL query is fired which uses a GROUP BY clause to group the data on TRIAL_TIME.
The query is something like this.
SELECT SUM(A1), COUNT(B1)
FROM TEST
WHERE <some clause>
GROUP BY TRIAL_TIME DIV 300000
ORDER BY <some column>;
300000 in the above query is the bucket size, in milliseconds, that I would like to group the data by (here, 5 minutes). So, for example, to group the data by 1 minute I would use 60000, and the query becomes
SELECT SUM(A1), COUNT(B1)
FROM TEST
WHERE <some clause>
GROUP BY TRIAL_TIME DIV 60000
ORDER BY <some column>;
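As a sanity check on the bucketing arithmetic, here is a minimal Python sketch (timestamps and values are hypothetical) that mirrors what `GROUP BY TRIAL_TIME DIV 60000` computes:

```python
# Epoch-millisecond timestamps are bucketed with integer division,
# mirroring GROUP BY TRIAL_TIME DIV 60000 (one bucket per minute).
from collections import defaultdict

rows = [
    (1_700_000_010_000, 5),  # (TRIAL_TIME in ms, A1) -- hypothetical values
    (1_700_000_050_000, 7),
    (1_700_000_070_000, 2),
]

BUCKET_MS = 60_000  # 1-minute buckets; use 300_000 for 5-minute buckets
groups = defaultdict(lambda: [0, 0])  # bucket -> [SUM(A1), COUNT(B1)]
for ts, a1 in rows:
    bucket = ts // BUCKET_MS  # same result as SQL's DIV for non-negative values
    groups[bucket][0] += a1
    groups[bucket][1] += 1
```

The last two rows fall in the same minute and collapse into one group, which is exactly what the SQL query does server-side.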
The question: one possible solution would be to add new columns, parse the epoch time to extract the DATE, TIME, etc. fields, and populate those new columns with the appropriate values so that the GROUP BY becomes simpler.
Is this a wise solution?
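For reference, the derived columns this solution proposes could be computed at write time like so; a Python sketch with hypothetical column names:

```python
# Sketch of the proposed precomputed columns: instead of evaluating
# TRIAL_TIME DIV 60000 per row at query time, materialize the values once
# when the row is inserted. Column names here are hypothetical.
from datetime import datetime, timezone

def derived_columns(trial_time_ms: int) -> dict:
    """Fields one might store alongside TRIAL_TIME."""
    dt = datetime.fromtimestamp(trial_time_ms / 1000, tz=timezone.utc)
    return {
        "trial_date": dt.date().isoformat(),          # DATE part of the epoch
        "trial_minute_bucket": trial_time_ms // 60_000,  # 1-minute GROUP BY key
    }
```

Note that this only removes the per-row division; the GROUP BY still has to touch the same number of rows, which is why it may not buy much on its own.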
NOTE - For the record, I use MySQL 5.1 and the Infobright engine. The current query takes about 3 minutes to execute (because of the GROUP BY clause). The performance goal is to bring it under 30 seconds.
Upvotes: 0
Views: 848
Reputation: 142306
WHERE ... -- with a good index, this _might_ be less of a problem; otherwise it needs a scan
GROUP BY FLOOR(ts/300000) -- adding a column will not help
ORDER BY something_else -- this will force [another] sort
How many rows are you scanning? If it is a large number, it is not reasonable to expect high speed without some form of summary table.
You mentioned Infobright, but you have not mentioned which key is 'preferred' in the storage of the data. Infobright will skip chunks of 64K rows that do not match the WHERE clause; are you taking advantage of that? If not, all relevant columns need to be unpacked from all chunks.
Summary tables are the usual fix -- note, however, that my write-up on them was not written with Infobright in mind.
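The summary-table idea amounts to pre-aggregating at the finest bucket size you will ever query, then answering coarser queries by combining those rows instead of scanning raw data. A minimal Python sketch of the mechanism (function names are hypothetical):

```python
# Summary-table sketch: maintain per-minute pre-aggregates at ingest time,
# then serve a 5-minute query from 5 summary rows instead of millions of raw rows.
from collections import defaultdict

summary = defaultdict(lambda: [0, 0])  # minute_bucket -> [SUM(A1), COUNT(B1)]

def ingest(trial_time_ms: int, a1: int) -> None:
    """Update the summary as each raw row arrives."""
    row = summary[trial_time_ms // 60_000]
    row[0] += a1
    row[1] += 1

def query_5min(bucket_5min: int) -> tuple:
    """SUM(A1), COUNT(B1) for one 5-minute bucket, read from the summary only."""
    total, count = 0, 0
    # A 5-minute bucket b covers minute buckets b*5 .. b*5+4.
    for minute in range(bucket_5min * 5, bucket_5min * 5 + 5):
        row = summary.get(minute)
        if row:
            total += row[0]
            count += row[1]
    return total, count
```

Because 5-minute (and any other) bucket sizes in the question are exact multiples of 1 minute, a single 1-minute summary table can serve all of them, and SUM/COUNT compose cleanly across buckets.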
Upvotes: 1