hepsibitti
hepsibitti

Reputation: 33

Get Max And Min dates for consecutive values in T-SQL

I have a log table like below and want to simplfy it by getting min start date and max end date for consecutive Status values for each Id. I tried many window function combinations but no luck.

This is what I have:

enter image description here

This is what want to see:

enter image description here

Upvotes: 3

Views: 1761

Answers (2)

GMB
GMB

Reputation: 222432

This is a typical gaps-and-islands problem. You want to aggregate groups of consecutive records that have the same Id and Status.

No need for recursion, here is one way to solve it using window functions:

select
    Id,
    Status,
    min(StartDate) StartDate,
    max(EndDate) EndDate
from (
    select
        t.*,
        row_number() over(partition by id order by StartDate) rn1,
        row_number() over(partition by id, status order by StartDate) rn2
    from mytable t
) t
group by
    Id,
    Status,
    rn1 - rn2
order by Id, min(StartDate)

The query works by ranking records over two different partitions (by Id, and by Id and Status). The difference between the ranks gives you the group each record belongs to. You can run the subquery independently to see what it returns and understand the logic.

Demo on DB Fiddle:

Id | Status | StartDate           | EndDate            
-: | :----- | :------------------ | :------------------
 1 | B      | 07/02/2019 00:00:00 | 18/02/2019 00:00:00
 1 | C      | 18/02/2019 00:00:00 | 10/03/2019 00:00:00
 1 | B      | 10/03/2019 00:00:00 | 01/04/2019 00:00:00
 2 | A      | 05/02/2019 00:00:00 | 22/04/2019 00:00:00
 2 | D      | 22/04/2019 00:00:00 | 05/05/2019 00:00:00
 2 | A      | 05/05/2019 00:00:00 | 30/06/2019 00:00:00

Upvotes: 1

PeterHe
PeterHe

Reputation: 2766

Try the following query. First order the data by StartDate and generate a sequence (rid). Then you the recursive cte to get the first row (rid=1) for each group (id,status), and recursively get the next row and compare the start/end date.

;WITH cte_r(id,[Status],StartDate,EndDate,rid)
AS
(
SELECT id,[Status],StartDate,EndDate, ROW_NUMBER() OVER(PARTITION BY Id,[Status] ORDER BY StartDate) AS rid
FROM log_table 
),
cte_range(id,[Status],StartDate,EndDate,rid)
AS
(
  SELECT id,[Status],StartDate,EndDate,rid
  FROM cte_r
  WHERE rid=1
  UNION ALL
  SELECT p.id, p.[Status], CASE WHEN c.StartDate<p.EndDate THEN p.StartDate ELSE c.StartDate END AS StartDate, c.EndDate,c.rid
  FROM cte_range p
  INNER JOIN cte_r c
  ON p.id=c.id
  AND p.[Status]=c.[Status]
  AND p.rid+1=c.rid
 )
 SELECT id,[Status],StartDate,MAX(EndDate) AS EndDate FROM cte_range GROUP BY id,StartDate ;

Upvotes: 1

Related Questions