Stuart
Stuart

Reputation: 130

intersection of dates for an arbitrary number of time series in SQL

How do I calculate the intersection of dates for an arbitrary number of time series in SQL without loops?

Given that I have a table with 3 columns

  1. GroupId
  2. Start Date
  3. End Date

for each GroupId there are multiple rows - so multiple start/end dates per GroupId (within a group there is no overlapping)

What I would like to do is find all the intersecting dates between all the groups So if Group 1 has the following dates

1/1/2001 - 1/31/2001
3/31/2001 - 4/5/2001

Group 2 has the following

1/25/2001 - 5/1/2001

Group 3 has the following

1/22/2001 - 4/1/2001

The above all intersect at:

1/25/2001 - 1/31/2001
3/31/2001 - 1/4/2001

How do I do this in SQL? The following has the logic for 2 date ranges within one table, but I need it for n date ranges in one table:

SELECT 
    case 
        when t1.StartDate > t2.StartDate then t1.StartDate 
        else t2.StartDate 
    END as StartDate,
    CASE WHEN t1.EndDate < t2.EndDate THEN t1.EndDate
        ELSE t2.EndDate
    END as EndDate
FROM Table1 t1
JOIN Table1 t2 ON t1.StartDate <= t2.EndDate AND t1.EndDate >= t2.StartDate    

(note: As an additional constraint I am doing this in linq to sql)

Thanks in advance

Upvotes: 0

Views: 419

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270773

You can do this by separating out the dates in the groups and using cumulative sums and then some comparisons. In SQL Server 2012+, this looks like:

with g as (
      select groupid, start as dte, 1 as enters, 0 as exits
      from t
      union all
      select groupid, dateadd(day, 1, end), 0, 1
      from t
     ),
     gs as (
      select groupid, dte, sum(enters) as enters, sum(exits) as exits,
             sum(sum(enters)) over (order by dte) as cumeenters,
             sum(sum(exits)) over (order by dte) as cumexits
      from g
      group by dte, groupid
     )
select ne, nextdte
from (select gs.*, lead(dte) over (order by dte) as nextdte
      from gs
     ) gs
where cumeenters - cumeexits = (select count(distinct groupid) from t);

The processing is as follows:

  • The first subquery separates the start and end dates into separate records.
  • The second accumulates the starts and the ends to get the cumulative values.
  • The where clause chooses the dates where all the groups are represented.

Here is the SQL Fiddle.

Upvotes: 1

Related Questions