Kashyap MNVL
Kashyap MNVL

Reputation: 603

Recursively loop through a SQL table and find intervals based on Start and End Dates

I have a SQL table that contains employeeid, StartDateTime and EndDatetime as follows:

CREATE TABLE Sample
(
    SNO INT,
    EmployeeID NVARCHAR(10),
    StartDateTime DATE,
    EndDateTime DATE
)

INSERT INTO Sample
VALUES
( 1, 'xyz', '2018-01-01', '2018-01-02' ), 
( 2, 'xyz', '2018-01-03', '2018-01-05' ), 
( 3, 'xyz', '2018-01-06', '2018-02-01' ), 
( 4, 'xyz', '2018-02-15', '2018-03-15' ), 
( 5, 'xyz', '2018-03-16', '2018-03-19' ),
( 6, 'abc', '2018-01-16', '2018-02-25' ),
( 7, 'abc', '2018-03-08', '2018-03-19' ),
( 8, 'abc', '2018-02-26', '2018-03-01' )

I want the result to be displayed as

EmployeeID  |  StartDateTime  |  EndDateTime
------------+-----------------+---------------
   xyz      |  2018-01-01     |  2018-02-01
   xyz      |  2018-02-15     |  2018-03-19
   abc      |  2018-01-16     |  2018-03-01
   abc      |  2018-03-08     |  2018-03-19

Basically, I want to recursively look at records of each employee and datemine the continuity of Start and EndDates and make a set of continuous date records.

I wrote my query as follows:

SELECT * 
FROM dbo.TestTable T1 
LEFT JOIN dbo.TestTable t2 ON t2.EmpId = T1.EmpId
WHERE t1.EndDate = DATEADD(DAY, -1, T2.startdate)

to see if I could decipher something from the output looking for a pattern. Later realized that with the above approach, I need to join the same table multiple times to get the output I desire.

Also, there is a case that there can be multiple employee records, so I need direction on efficient way of getting this desired output.

Any help is greatly appreciated.

Upvotes: 0

Views: 1326

Answers (2)

Serkan Arslan
Serkan Arslan

Reputation: 13393

You can use this.

WITH T AS (
    SELECT  S1.SNO, 
            S1.EmployeeID, 
            S1.StartDateTime, 
            ISNULL(S2.EndDateTime, S1.EndDateTime) EndDateTime,
            ROW_NUMBER() OVER(PARTITION BY S1.EmployeeId ORDER BY S1.StartDateTime) 
                - ROW_NUMBER() OVER(PARTITION BY S1.EmployeeId, CASE WHEN S2.StartDateTime IS NULL THEN 0 ELSE 1 END  ORDER BY S1.StartDateTime ) RN,
            ROW_NUMBER() OVER(PARTITION BY S1.EmployeeId, ISNULL(S2.EndDateTime, S1.EndDateTime) ORDER BY S1.EmployeeId, S1.StartDateTime) RN_END
    FROM Sample S1
        LEFT JOIN Sample S2 ON DATEADD(DAY,1,S1.EndDateTime) = S2.StartDateTime
)
SELECT EmployeeID,  MIN(StartDateTime) StartDateTime,MAX(EndDateTime) EndDateTime FROM T
WHERE RN_END = 1
GROUP BY EmployeeID, RN
ORDER BY EmployeeID DESC, StartDateTime

Result:

EmployeeID StartDateTime EndDateTime
---------- ------------- -----------
xyz        2018-01-01    2018-02-01
xyz        2018-02-15    2018-03-19
abc        2018-01-16    2018-03-01
abc        2018-03-08    2018-03-19

Upvotes: 1

TomC
TomC

Reputation: 2804

This will do it for you. Use a recursive CTE to get all the adjacent rows, then get the highest end date for each start date, then the first start date for each end date.

;with cte as (
    select EmployeeID, StartDateTime, EndDateTime 
    from sample s
    union all
    select CTE.EmployeeID, CTE.StartDateTime, s.EndDateTime
    from sample s
    join cte on cte.EmployeeID=s.EmployeeID and s.StartDateTime=dateadd(d,1,CTE.EndDateTime)
)
select EmployeeID, Min(StartDateTime) as StartDateTime, EndDateTime from (
    select EmployeeID, StartDateTime, Max(EndDateTime) as EndDateTime from cte
    group by EmployeeID, StartDateTime
) q group by EmployeeID, EndDateTime

Upvotes: 2

Related Questions