Independent
Independent

Reputation: 2987

How to make this query aggregate the time-grouping correct?

I try to perform a grouping based on rows based on a string and a eventtime. The identical strings must be grouped with eventtime occurrances within period of 30 minutes. If there are more then 30 minutes between eventtimes, there is a new groupset.

So far I found the following attempt (by many similiar, but this is simpliest). With the comments below, i also note that i missunderstood what it does. It does a 30 minutes rounding, which won't fill my need. Perhaps "grouping based on 'within daterange'" is more accurace?.

SELECT  min(eventtime) as mintime, 
        max(eventtime) as maxtime, 
        [stringfield],
        count([stringfield]) as qty
FROM Searches
GROUP BY datediff(mi, '19900101', eventtime) / 30, 
         [stringfield]
ORDER BY min(eventtime)

Resulting in this table enter image description here

See the red and blue arrows. Those two lines should be aggregated. Red arrows says 'identical strings'. The blue arrow says that 'the maxtime are within 30 min range'.

I have also tried 31 minutes without success, in case of >= issue. Can someone see by this T-SQL why it go wrong? And how I can fix it?

[Edit]
The expected output is the two rows with arrows grouped. Which means i expect five rows in the output. The 5th row should not exist (aggregated into first). Which results in Qty 3 in the first row, the maxtime will be '2013-06-01 08:55'.

2013-06-01 08:00 | 2013-06-01 08:55 | 0x2BBF4........26BD38 | 3

Upvotes: 1

Views: 89

Answers (1)

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115520

In the 2012 version of SQL-Server, you can use the analytic LAG() and LEAD() functions for this:

; WITH cte AS
  ( SELECT
        stringfield
      , eventtime
      , rn = ROW_NUMBER() OVER ( PARTITION BY stringfield
                                 ORDER BY eventtime )
      , leadtime = LEAD(eventtime) OVER ( PARTITION BY stringfield
                                          ORDER BY eventtime )
      , firsttime= MIN(eventtime) OVER ( PARTITION BY stringfield )
      , diff = CASE WHEN DATEADD(minute, 30, eventtime)
                         >= LEAD(eventtime) 
                            OVER ( PARTITION BY stringfield
                                   ORDER BY eventtime )
                    THEN 0 ELSE 1
               END 
    FROM 
        Searches
  ) 
SELECT 
    stringfield
  , mintime = COALESCE(LAG(leadtime) OVER ( PARTITION BY stringfield
                                            ORDER BY eventtime )
               ,firsttime) 
  , maxtime = eventtime 
  , qty = rn - COALESCE(LAG(rn) OVER ( PARTITION BY stringfield
                                       ORDER BY eventtime ) 
                       ,0)
FROM 
    cte 
WHERE 
    diff = 1
ORDER BY 
    stringfield,
    maxtime ;

Tested at SQL-Fiddle.

Upvotes: 3

Related Questions