James Santiago
James Santiago

Reputation: 3062

Reduce/Summarize and Replace Timestamped Records

I have a SQL table that has timestamped records for server performance data. This data is polled and stored every 1 minute for multiple servers. I want to keep data for a large period of time but reduce the number records for data older than six months.

For example, I have some old records like so:

    Timestamp  Server  CPU  App1  App2
1   ... 00:01  Host1   5    1     10    
2   ... 00:01  Host2   10   5     20
3   ... 00:02  Host1   6    0     11
4   ... 00:02  Host2   11   5     20
5   ... 00:03  Host1   4    1     9
6   ... 00:04  Host2   9    6     19

I want to be able to reduce this data from every minute to every 10 minutes or possibly every hour for older data.

My initial assumption is that I'd average the values for times within a 10 minute time period and create a new timestamped record after deleting the old records. Could I create a sql query that generates the insert statements for the new summarized records? What would that query look like?

Or is there a better way to accomplish this summarization job?

Upvotes: 0

Views: 49

Answers (2)

Ed Gibbs
Ed Gibbs

Reputation: 26343

You might also want to consider moving the summarized information into a different table so you don't end up in a situation where you're wondering if you're looking at "raw" or summarized data. Other benefits would be that you could include MAX, MIN, STDDEV and other values along with the AVG.

The tricky part is chunking out the times. The best way I could think of was to start with the output from the CONVERT(blah, Timestamp, 120) function:

-- Result: 2015-07-08 20:50:55
SELECT CONVERT(VARCHAR(19), CURRENT_TIMESTAMP, 120)

By cutting it off after the hour or after the 10-minute point you can truncate the times:

-- Hour; result is 2015-07-08 20
SELECT CONVERT(VARCHAR(13), CURRENT_TIMESTAMP, 120)

-- 10-minute point; result is 2015-07-08 20:50:5
SELECT CONVERT(VARCHAR(15), CURRENT_TIMESTAMP, 120)

With a little more massaging you can fill out the minutes for either one and CAST it back to a DATETIME or DATETIME2:

-- Hour increment
CAST(CONVERT(VARCHAR(13), CURRENT_TIMESTAMP, 120) + ':00' AS DATETIME)

-- 10-minute increment
CAST(CONVERT(VARCHAR(15), CURRENT_TIMESTAMP, 120) + 0' AS DATETIME)

Using the logic above, all times are truncated. In other words, the hour formula will convert Timestamp where 11:00 <= Timestamp < 12:00 to 11:00. The minute formula will convert Timestamp where 11:20 <= Timestamp < 11:30 to 11:20.

So the better part query looks like this (I've left out getting rid of the rows you've just summarized):

-- The hour-increment version
INSERT INTO myTableOrOtherTable
SELECT
  CAST(CONVERT(VARCHAR(13), [Timestamp], 120) + ':00' AS DATETIME),
  AVG(CPU),
  AVG(App1),
  AVG(App2)
FROM myTable
GROUP BY
  CAST(CONVERT(VARCHAR(13), [Timestamp], 120) + ':00' AS DATETIME)

Upvotes: 1

Felix Pamittan
Felix Pamittan

Reputation: 31879

Assuming you have record for every minute, this is how you can group your records by 10 minutes:

SELECT
    [Timestamp] = MIN([Timestamp]),
    [Server],
    CPU = AVG(CPU),
    App1 = AVG(App1),
    App2 = AVG(App2)
FROM (
    SELECT *,
        RN = (ROW_NUMBER() OVER(PARTITION BY [Server] ORDER BY [Timestamp]) - 1) / 10
    FROM temp
)t
GROUP BY [Server], RN

Upvotes: 1

Related Questions