Oliver Martin
Oliver Martin

Reputation: 213

How to find most overlapping time period with date ranges

Suppose you have a table with an identifier, a start time and an end time. These start and end times can be any length of time. The start time is always before the end time. Assume there are no nulls.

What kind of query would tell me the most "popular" time, i.e. where the two ranges in each row overlaps with the most other rows?

The real life application of this is that it's a table recording users' sign in and sign out times. I want to write a query that will tell me when the most concurrent users were logged in and see what period of time this was.

Thank you.

Upvotes: 6

Views: 2381

Answers (2)

Lukas Eder
Lukas Eder

Reputation: 220762

Here's an example solution using a simple self-join and a GROUP BY:

WITH d(id, t1, t2) AS (
    SELECT 1, date '2010-01-01', date '2010-03-01' FROM DUAL UNION ALL
    SELECT 2, date '2010-02-01', date '2010-04-01' FROM DUAL UNION ALL
    SELECT 3, date '2010-02-01', date '2010-04-01' FROM DUAL UNION ALL
    SELECT 4, date '2010-01-01', date '2010-01-03' FROM DUAL UNION ALL
    SELECT 5, date '2011-01-01', date '2011-02-15' FROM DUAL
)
SELECT d1.id, d1.t1, d1.t2, 
       COUNT(*) "Overlap count", 
       LISTAGG('[' || d2.t1 || ', ' || d2.t2 || ']', ', ')
       WITHIN GROUP (ORDER BY d2.id) "Overlapping intervals"
FROM d d1 
LEFT OUTER JOIN d d2 
ON d2.t1 <= d1.t2 AND d1.t1 <= d2.t2
GROUP BY d1.id, d1.t1, d1.t2
ORDER BY COUNT(*) DESC

The "Overlapping intervals" aggregation is for illustration only.

SQLFiddle

... with output:

| ID | OVERLAP COUNT |                                                                          OVERLAPPING INTERVALS |
|----|---------------|------------------------------------------------------------------------------------------------|
|  1 |             4 | [01-JAN-10, 01-MAR-10], [01-FEB-10, 01-APR-10], [01-FEB-10, 01-APR-10], [01-JAN-10, 03-JAN-10] |
|  2 |             3 |                         [01-JAN-10, 01-MAR-10], [01-FEB-10, 01-APR-10], [01-FEB-10, 01-APR-10] |
|  3 |             3 |                         [01-JAN-10, 01-MAR-10], [01-FEB-10, 01-APR-10], [01-FEB-10, 01-APR-10] |
|  4 |             2 |                                                 [01-JAN-10, 01-MAR-10], [01-JAN-10, 03-JAN-10] |
|  5 |             1 |                                                                         [01-JAN-11, 15-FEB-11] |

Upvotes: 5

Gordon Linoff
Gordon Linoff

Reputation: 1269513

There are several approaches to this. One uses correlated subqueries. That isn't much fun. Instead, let's use the cumulative sum method because you have Oracle.

The key is to start with a list of timestamps with a value of +1 for a start and -1 for an end. This is easy:

select t.*
from ((select starttime as thetime, 1 as value from table t) union all
      (select endtime, -1 as value from table t)
     ) t

Now, the cumulative sum of the value tells you the number of active overlaps at any given time:

select t.*, sum(value) over (order by thetime) as numactives
from ((select starttime as thetime, 1 as value from table t) union all
      (select endtime, -1 as value from table t)
     ) t

This solves your problem. You probably want to add an order by numactives desc for the specific times.

Upvotes: 7

Related Questions