Steven A. Lowe
Steven A. Lowe

Reputation: 61223

What is a simple and efficient way to find rows with time-interval overlaps in SQL?

I have two tables, both with start time and end time fields. I need to find, for each row in the first table, all of the rows in the second table where the time intervals intersect.

For example:

           <-----row 1 interval------->
<---find this--> <--and this--> <--and this-->

Please phrase your answer in the form of a SQL WHERE-clause, AND consider the case where the end time in the second table may be NULL.

Target platform is SQL Server 2005, but solutions from other platforms may be of interest also.

Upvotes: 40

Views: 51851

Answers (5)

Lukasz Szozda
Lukasz Szozda

Reputation: 175586

"solutions from other platforms may be of interest also."

SQL Standard defines OVERLAPS predicate:

Specify a test for an overlap between two events.

<overlaps predicate> ::=
   <row value constructor 1>  OVERLAPS <row value constructor 2> 

Example:

SELECT 1
WHERE ('2020-03-01'::DATE, '2020-04-15'::DATE) OVERLAPS 
      ('2020-02-01'::DATE, '2020-03-15'::DATE) 
-- 1

db<>fiddle demo

Upvotes: 2

Sven
Sven

Reputation: 1

And what, if you want to analyse such an overlap on a minute precision with 70m+ rows? the only solution i could make up myself was a time dimension table for the join

else the dublicate-handling became a headache .. and the processing cost where astronomical

Upvotes: -1

Mike S
Mike S

Reputation: 306

It's sound very complicated until you start working from reverse. Below I illustrated ONLY GOOD CASES (no overlaps)! defined by those 2 simple conditions, we have no overlap ranges if condA OR condB is TRUE, so we going to reverse those: NOT condA AND NOT CondB, in our case I just reversed signs (> became <=)

/*
|--------| A                             \___  CondA: b.ddStart >  a.ddEnd
            |=========| B                /      \____ CondB:  a.ddS >  b.ddE
                          |+++++++++| A         /
*/
--DROP TABLE ran
create table ran ( mem_nbr int, ID int, ddS date, ddE date)
insert ran values  
(100, 1,  '2012-1-1','2012-12-30'),    ----\ ovl
(100, 11, '2012-12-12','2012-12-24'),  ----/
(100, 2, '2012-12-31','2014-1-1'),
(100, 3, '2014-5-1','2014-12-14') ,

(220, 1, '2015-5-5','2015-12-14') ,    ---\ovl
(220, 22, '2014-4-1','2015-5-25') ,    ---/
(220, 3, '2016-6-1','2016-12-16')  

select  DISTINCT a.mem_nbr ,  a.* , '-' [ ], b.dds, b.dde, b.id 
FROM ran a
join ran b  on  a.mem_nbr = b.mem_nbr          -- match by mem#
               AND     a.ID <> b.ID            -- itself
                  AND     b.ddS <= a.ddE        -- NOT    b.ddS >  a.ddE       
                  AND     a.ddS <= b.ddE        -- NOT    a.ddS >  b.ddE   

Upvotes: 2

Khoth
Khoth

Reputation: 13328

SELECT * 
FROM table1,table2 
WHERE table2.start <= table1.end 
AND (table2.end IS NULL OR table2.end >= table1.start)

Upvotes: 69

Casper
Casper

Reputation: 1252

select * from table_1 
right join 
table_2 on 
(
table_1.start between table_2.start and table_2.[end]
or
table_1.[end] between table_2.start and table_2.[end]
or
(table_1.[end] > table_2.start and table_2.[end] is null)
)

EDIT: Ok, don't go for my solution, it perfoms like shit. The "where" solution is 14x faster. Oops...

Some statistics: running on a db with ~ 65000 records for both table 1 and 2 (no indexing), having intervals of 2 days between start and end for each row, running for 2 minutes in SQLSMSE (don't have the patience to wait)

Using join: 8356 rows in 2 minutes

Using where: 115436 rows in 2 minutes

Upvotes: 1

Related Questions