IHeartTurbo
IHeartTurbo

Reputation: 23

Combining records with overlapping date ranges in SQL

**EDIT: Our current server is SQL 2008 R2 so LAG/LEAD functions will not work.

I'm attempting to take multiple streams of data within a table and combine them into 1 stream of data. Given the 3 streams of data below I want the end result to be 1 stream that gives preference to the status 'on'. Recursion seems to be the best option but I've had no luck so far putting together a query that does what i want.

CREATE TABLE #Dates(
    id INT IDENTITY,
    status VARCHAR(4),
    StartDate Datetime,
    EndDate Datetime,
    booth int)

INSERT #Dates
VALUES
( 'off','2015-01-01 08:00','2015-01-01 08:15',1),
( 'on','2015-01-01 08:15','2015-01-01 09:15',1),
( 'off','2015-01-01 08:50','2015-01-01 09:00',2),
( 'on','2015-01-01 09:00','2015-01-01 09:30',2),
( 'off','2015-01-01 09:30','2015-01-01 09:35',2),
( 'on','2015-01-01 09:35','2015-01-01 10:15',2),
( 'off','2015-01-01 09:30','2015-01-01 10:30',3),
( 'on','2015-01-01 10:30','2015-01-01 11:00',3)

status  StartDate   EndDate
---------------------------
off     08:00       08:15
on      08:15       09:15
off     08:50       09:00
on      09:00       09:30
off     09:30       09:35
on      09:35       10:15
off     09:30       10:30
on      10:30       11:00

End Result:

status  StartDate   EndDate
---------------------------
off     8:00        8:15
on      8:15        9:15
on      9:15        9:30
off     9:30        9:35
on      9:35        10:15
off     10:15       10:30
on      10:30       11:00

Essentially, anytime there is a status of 'on' it should override any concurrent 'off' status.

Source:
|----off----||---------on---------|
                     |---off--||------on----||---off---||--------on------|
                                             |--------------off------------------||------on------|

Result (Either result would work):
|----off----||----------on--------||---on---||---off---||--------on------||-off--||------on------|
|----off----||----------on------------------||---off---||--------on------||-off--||------on------|

Upvotes: 2

Views: 2681

Answers (2)

James Z
James Z

Reputation: 12318

Here's the simplest version for 2008 that I was able to figure out:

; with Data (Date) as (
select StartDate from Dates
union
select EndDate from Dates),

Ranges (StartDate, Status) as (
select D.Date, D2.Status
from Data D
outer apply (
  select top 1 D2.Status
  from Dates D2
  where D2.StartDate <= D.Date and D2.EndDate > D.Date
  order by case when Status = 'on' then 1 else 2 end
) D2)

select R.StartDate,
(select min(D.Date) from Data D where D.Date > R.StartDate) as EndDate,
Status
from Ranges R
order by R.StartDate

It will return new row starting from each start / end point even if the status is the same as previous. Didn't find any simple way to combine them.

Edit: Changing the first CTE to this will combine the rows:

; with Data (Date) as (
select distinct StartDate from Dates D1
where not exists (Select 1 from Dates D2 
    where D2.StartDate < D1.StartDate and D2.EndDate > D1.StartDate and
    Status = 'on')
union
select distinct EndDate from Dates D1
where not exists (Select 1 from Dates D2 
    where D2.StartDate < D1.EndDate and D2.EndDate > D1.EndDate and
    Status = 'on')
),

Upvotes: 1

James Z
James Z

Reputation: 12318

So basically every time there's even one "on" record, it is on, otherwise off?

Here's a little different kind of approach to the issue, adding +1 every time an "on" cycle starts, and adding -1 when it ends. Then we can use a running total for the status, and when the status is 0, then it's off, and otherwise it is on:

select Date, 
sum(oncounter) over (order by Date) as onstat,
sum(offcounter) over (order by Date) as offstat
from (
select StartDate as Date, 
  case when status = 'on' then 1 else 0 end oncounter,
  case when status = 'off' then 1 else 0 end offcounter
from Dates
union all
select EndDate as Date,
  case when status = 'on' then -1 else 0 end oncounter,
  case when status = 'off' then -1 else 0 end offcounter
from Dates
) TMP

Edit: Added also counter for off -states. It works the same way as "on" counter and when both are 0, then status is neither on or off.

Final result, it seems it can be done, although it's not looking that nice anymore, but at least it's not recursive :)

select
    Date as StartDate,
    lead(Date, 1, '21000101') over (order by Date) as EndDate,
    case onstat 
        when 0 then 
            case when offstat > 0 then 'Off' else 'N/A' end
        else 'On' end as State
from (
    select
        Date,
        onstat, prevon,
        offstat, prevoff
    from (
        Select 
            Date,
            onstat,
            lag(onstat, 1, 0) over (order by Date) as prevon,
            offstat,
            lag(offstat, 1, 0) over (order by Date) as prevoff
        from (
            select 
                Date, 
                sum(oncounter) over (order by Date) as onstat,
                sum(offcounter) over (order by Date) as offstat
            from (
                select 
                    StartDate as Date, 
                    case when status = 'on' then 1 else 0 end oncounter,
                    case when status = 'off' then 1 else 0 end offcounter
                from 
                    Dates

            union all

                select 
                    EndDate as Date,
                    case when status = 'on' then -1 else 0 end oncounter,
                    case when status = 'off' then -1 else 0 end offcounter
                from 
                    Dates
            ) TMP
        ) TMP2
    ) TMP3
    where (onstat = 1 and prevon = 0)
    or (onstat = 0 and prevon = 1)
    or (onstat = 0 and offstat = 1 and prevoff = 0)
    or (onstat = 0 and offstat = 0 and prevoff = 1)
) TMP4

It has quite many derived tables for the window functions and getting only the status changes into the result set so lead can pick up correct dates. It might be possible to get rid of some of them.

SQL Fiddle: http://sqlfiddle.com/#!6/b5cfa/7

Upvotes: 0

Related Questions