Jean-Philippe Chenel
Jean-Philippe Chenel

Reputation: 21

SQL Server : group by datetime interval

Running SQL Server 2016 Express.

I have this table with sample data:

ID    Datetime2              other columns
------------------------------------------
1     2017-01-14 11:00:00    ...
1     2017-01-14 11:01:00    ...
1     2017-01-14 11:02:00    ...
1     2017-01-14 11:03:00    ...
1     2017-01-14 11:10:00    ... --> 7 minutes gap
1     2017-01-14 11:11:00    ...
1     2017-01-14 11:20:00    ... --> 9 minutes gap
1     2017-01-14 11:22:00    ...
1     2017-01-14 11:24:00    ...

I want to have this kind of result

ID   start               end                 other columns
-----------------------------------------------------------
1    2017-01-14 11:00    2017-01-14 11:03    ...
1    2017-01-14 11:10    2017-01-14 11:11    ...
1    2017-01-14 11:20    2017-01-14 11:24    ...

A new group must me created when we have a gap of 5 or X minutes or more between current row datetime value and next row datetime value.

I have this kind of query, but I can't figure out where to group rows when it have a gap of 5 minutes or more.

WITH groups(DateTimeField, grp) AS 
(
    SELECT DISTINCT
        DateTimeField,
        DATEDIFF(MINUTE, DateTimeField, lag(DateTimeField) OVER (ORDER BY DateTimeField DESC)) grp
    FROM 
        MyTable
    WHERE 
        ID = 1
)
SELECT
    COUNT(*) AS consecutiveDates,
    MIN(DateTimeField) AS minDate,
    MAX(DateTimeField) AS maxDate
FROM
    groups
GROUP BY 
    grp
ORDER BY 
    1 DESC, 2 DESC

Best regards,

Upvotes: 0

Views: 1626

Answers (2)

Jean-Philippe Chenel
Jean-Philippe Chenel

Reputation: 21

Thanks to @apc. Revised the code to add lag function instead of join and removed the rows where the [Break]<>0

//..
--use of the lag function

INSERT INTO @DatesTables2
SELECT 
    ROW_NUMBER() OVER (ORDER BY a.DateTime2 DESC),
    a.DateTime2 AS DateA, 
    lag(DateTime2 ) OVER (ORDER BY DateTime2 DESC) AS DateB,
    DATEDIFF(MINUTE, a.DateTime2 , lag(DateTime2 ) OVER (ORDER BY DateTime2 DESC)) AS DiffMin,
    CASE WHEN  DATEDIFF(MINUTE, a.DateTime2 , lag(DateTime2 ) OVER (ORDER BY DateTime2 DESC)) > 5 THEN 1 ELSE 0 END AS [Break]
FROM DatesTables a

//..
--Query only where [Break=0] to discard the end value where the gap is more than 5 minutes.

INSERT INTO @Groups
SELECT a.DateA, a.DateB, a.DiffMin, a.[Break], SUM(b.[Break]) + a.[Break] AS [Group]
FROM @DatesTables2 a JOIN @DatesTables2 b ON b.newid < a.newid
WHERE a.[Break] = 0
GROUP BY a.DateA, a.DateB, a.[Break], a.DiffMin

//..
--Finally get the datetime diff between min and max.

SELECT [Group], MIN(DateA) AS Start, MAX(DateB) AS [End], DATEDIFF(MINUTE, MIN(DateA), MAX(DateB))
FROM @Groups
GROUP BY [Group]
ORDER BY [Group]

Upvotes: 0

apc
apc

Reputation: 5586

Firstly for testing I have generated some random dates:

DECLARE @DatesTables TABLE (ID INT, [DateTime2] DATETIME)

DECLARE @ID INT
DECLARE @Date DATETIME
SET @Date = GETDATE()

WHILE (SELECT COUNT(*) FROM @DatesTables) < 50
BEGIN
    SET @ID = (SELECT COUNT(*) FROM @DatesTables) + 1
    SET @Date = DATEADD(MINUTE, rand() * 7 + 1,@Date)
    INSERT INTO @DatesTables (ID, [DateTime2]) VALUES (@ID, @Date)
END

Now lets calculate the difference between each row and the next and work out where our groups should start:

DECLARE @DatesTables2 TABLE ([NewID] INT, [DateA] DATETIME, [DateB] DATETIME, DiffMin INT, [Break] INT)

INSERT INTO @DatesTables2
SELECT 
    ROW_NUMBER() OVER (ORDER BY a.ID),
    a.DateTime2 AS DateA, 
    b.DateTime2 AS DateB,
    DATEDIFF(MINUTE, a.[DateTime2], b.[DateTime2]) AS DiffMin,
    CASE WHEN  DATEDIFF(MINUTE, a.[DateTime2], b.[DateTime2]) > 5 THEN 1 ELSE 0 END AS [Break]
FROM @DatesTables a JOIN @DatesTables b ON a.id = b.ID - 1

Assign each row a group by summing the number of breaks:

DECLARE @Groups TABLE (DateA DATETIME, DateB DATETIME,  DiffMin INT, [Break] INT, [Group] INT)

INSERT INTO @Groups
SELECT a.DateA, a.DateB,a.DiffMin, a.[Break], SUM(b.[Break]) + a.[Break] AS [Group] FROM @DatesTables2 a JOIN @DatesTables2 b 
ON b.newid < a.newid
GROUP BY a.DateA, a.DateB, a.[Break],a.DiffMin

Finally select your grouped results:

SELECT [Group], MIN(DateA) AS Start, MAX(DateB) AS [End] FROM @Groups GROUP BY [Group] ORDER BY [Group]

While I have used table variables for easier understanding you could use sub queries.

Upvotes: 1

Related Questions