Rob
Rob

Reputation: 11788

Get aggregated data by specifying a variable time interval

I do have an archive table in my DB that contains millions of datetime/value pairs. Now I need make that data available on a web page using a javascript chart.

Because of the huge amount of data I need to implement some kind of lazy loading and data aggregation. On the client side, everything already works out fine (with simulated aggregated data) - what's missing is the server-side SQL query that delivers the data.

To be precise: I wonder how to query data by specifying a variable amount of time (given in minutes, for example) where the result gives me the minimum, maximum and average value for that interval.

At the moment I'm using this code which gives me the desired data on a per-hour basis:

SELECT CAST(DATEADD(hour,FLOOR(DATEDIFF(hour,0,CAST(DateValue AS datetime))),0) as datetime) as d, avg(Value) as Av, min(Value) as Mn, max(Value) as Mx
  FROM [TESTDB].[dbo].[ValueArchive]
  WHERE StationId=5 and DataPointId=21507
  GROUP BY CAST(DATEADD(hour,FLOOR(DATEDIFF(hour,0,CAST(DateValue AS datetime))),0) as datetime)
  order by d

This query returns data like this:

Zeit                    Av                  Mn                  Mx
2014-07-31 07:00:00.000 21,5311364673079    21,1000043854118    22,0000043988228
2014-07-31 08:00:00.000 21,5188868579348    21,0000043839216    22,0000043988228
2014-07-31 09:00:00.000 21,5014227301436    10,3                21,9000043973327
2014-07-31 10:00:00.000 21,6056305272473    21,1000043854118    22,200004401803
2014-07-31 11:00:00.000 21,7478380172488    21,300004388392     22,200004401803
2014-07-31 12:00:00.000 21,9086686573894    21,5000043913722    22,4000044047833
2014-07-31 13:00:00.000 22,6290701521411    21,7000043943524    25,2000044465065
2014-07-31 14:00:00.000 24,3890669344227    23,7000044241548    25,1000044450164
2014-07-31 15:00:00.000 23,5097663260819    23,000004413724     24,0000044286251
2014-07-31 16:00:00.000 23,3656704604484    22,9000044122338    23,900004427135
2014-07-31 17:00:00.000 23,3113943797538    22,9000044122338    23,8000044256449
2014-07-31 18:00:00.000 23,3023299996241    22,8000044107437    23,8000044256449
2014-07-31 19:00:00.000 23,3033443003657    22,8000044107437    23,8000044256449
2014-07-31 20:00:00.000 23,265605921441     22,7000044092536    23,8000044256449
2014-07-31 21:00:00.000 23,1962274382304    22,7000044092536    23,8000044256449
2014-07-31 22:00:00.000 23,0771869545565    22,6000044077635    23,5000044211745
2014-07-31 23:00:00.000 22,9476058889577    22,200004401803     23,5000044211745
2014-08-01 00:00:00.000 22,7839329819327    22,3000044032931    23,2000044167042
2014-08-01 01:00:00.000 22,6155113662449    22,0000043988228    23,1000044152141
2014-08-01 02:00:00.000 22,4463981676577    21,9000043973327    22,9000044122338

This is a good start, but I don't know how to change this query in such a way that it works with intervals like 2 minutes, 15 minutes or 5 seconds etc. The intervals start/end dates should be aligned to full hours, so that if you query 10-minute-intervals for a specified time range, the data blocks should contain data from 10:00 to 10:10, 10:10 to 10:20 etc. (This quantizing is important when working with additional independent data series that use the same time x-axis.)

Is there a way to achieve that functionality directly in the DB by using a query (or stored procedure)? Or do I have to fetch all data first and then do the aggregating on the webserver?

Thanks for your help!

Upvotes: 0

Views: 406

Answers (1)

petelids
petelids

Reputation: 12815

As integer division in SQL Server will return an integer you can get the difference in time between your DateValue and some epoch, divide that by your interval and re-multiply it by your interval and convert the result back to a datetime. This will align your intervals to the hour (provided you use values that divide evenly; 7 second intervals won't align but 5 seonds will for example). In the below I'm using minutes but it's trivial to change it to use seconds:

declare @interval_minutes int = 15
select  dateadd(MINUTE, DATEDIFF(MINUTE, '01-jan-1970', DateValue)/@interval_minutes*@interval_minutes, '01-jan-1970') as d
        avg(Value) as Av,
        min(Value) as Mn,
        max(Value) as Mx
FROM    [TESTDB].[dbo].[ValueArchive]
WHERE   StationId=5 and DataPointId=21507
group by DATEDIFF(MINUTE, '01-jan-1970', DateValue)/@interval_minutes

As an example this SQL:

declare @interval_minutes int = 15
select  dateadd(MINUTE, DATEDIFF(MINUTE, '01-jan-1970', getdate())/@interval_minutes*@interval_minutes, '01-jan-1970') as d

will return 2014-10-28 13:15:00.000 when I run it at 13:23 on 28 Oct

Changing the interval to 30 gives 2014-10-28 13:00:00.000.

Upvotes: 1

Related Questions