jwzk

Reputation: 177

MySQL - Summary Tables

Which method do you suggest and why?

Creating a summary table and . . .

1) Updating the table as the action occurs in real time.

2) Running group by queries every 15 minutes to update the summary table.

3) Something else?

The data must be near real time; it can't wait an hour, a day, etc.

Upvotes: 1

Views: 1435

Answers (1)

MJB

Reputation: 7686

I think there is a third option, which might let you manage your CPU resources a little better: write a separate process that periodically updates the summary tables. Rather than recreating the summary with a GROUP BY, which is guaranteed to get slower over time because the source tables only grow, you may be able to update the summary values incrementally. Depending on the nature of the data this may not be possible, but if the data is so important that it has to be near real time, then I think you can afford the time to tweak the schema so the process can update the summary without reading every row in the source tables.
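A minimal sketch of that separate refresh process, assuming a Python runner (the refresh callable and the 15-minute interval are placeholders; cron or any other scheduler would work just as well):

```python
import time

REFRESH_INTERVAL_SECONDS = 15 * 60  # the 15-minute cadence

def run_periodically(refresh, interval_seconds, iterations=None, sleep=time.sleep):
    """Call refresh() on a fixed cadence.

    iterations=None runs forever; sleep is injectable so the loop
    can be exercised in tests without actually waiting.
    """
    n = 0
    while iterations is None or n < iterations:
        refresh()  # apply the incremental summary update here
        n += 1
        sleep(interval_seconds)
```

The injectable `sleep` is just there so the loop can be tested; a production process would call `run_periodically(refresh_summary, REFRESH_INTERVAL_SECONDS)` and let it run.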

For example, say your source data is just login_data (columns username, login_timestamp, logout_timestamp) and your summary is login_summary (columns username, count). Once every 15 minutes you could truncate login_summary and repopulate it with a select username, count(*) kind of query, but then you'd have to rescan the entire source table each time. To speed things up, you could add a last_update column to the summary table. Then every 15 minutes you'd only fold in the records newer than the last_update value for each user. More complicated, of course, but it has two benefits: 1) you only update the rows that changed, and 2) you only read the new rows.
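A runnable sketch of that last_update scheme, using Python's sqlite3 in place of MySQL to keep it self-contained (table and column names follow the example above; the upsert query is one possible shape, not the only one):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE login_data (
        username         TEXT,
        login_timestamp  INTEGER,
        logout_timestamp INTEGER
    );
    CREATE TABLE login_summary (
        username    TEXT PRIMARY KEY,
        count       INTEGER NOT NULL,
        last_update INTEGER NOT NULL
    );
""")

def refresh_summary(conn):
    # Fold in only the logins newer than each user's last_update,
    # instead of rescanning the whole login_data table.
    with conn:
        conn.execute("""
            INSERT INTO login_summary (username, count, last_update)
            SELECT d.username, COUNT(*), MAX(d.login_timestamp)
            FROM login_data d
            LEFT JOIN login_summary s ON s.username = d.username
            WHERE d.login_timestamp > COALESCE(s.last_update, 0)
            GROUP BY d.username
            ON CONFLICT(username) DO UPDATE SET
                count = count + excluded.count,
                last_update = excluded.last_update
        """)

# Initial load: two logins for alice, one for bob.
conn.executemany("INSERT INTO login_data VALUES (?, ?, ?)",
                 [("alice", 100, 110), ("alice", 200, 210), ("bob", 150, 160)])
refresh_summary(conn)

# 15 minutes later only one new row has arrived; only it gets read.
conn.execute("INSERT INTO login_data VALUES ('alice', 300, 310)")
refresh_summary(conn)

print(dict(conn.execute(
    "SELECT username, count FROM login_summary ORDER BY username")))
# → {'alice': 3, 'bob': 1}
```

In MySQL the same upsert would be spelled INSERT ... ON DUPLICATE KEY UPDATE; everything else carries over directly.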

And if 15 minutes turned out to be too stale for your users, you could adjust it to run every 10 minutes. That would cost some extra CPU, of course, but far less than rebuilding the entire summary every 15 minutes.

Upvotes: 3
