Finding the largest time overlap in T-SQL

Question

I'm trying to do this on SQL Server 2008 R2.

I have a table with 4 columns:

parent_id INT
child_id INT
start_time TIME
end_time TIME

You should look at the children as sub-processes that run for the parent program. All these sub-processes are run once every day, and each child run within its given time span. I want to find the largest overlap of time intervals for each parent based on the times of its children, i.e. I want to know the longest possible overlap where all the sub-processes are running. The fact that each time span is repeated every day means that even if child's time interval spans midnight (i.e. 23:00-10:00), it can overlap with a child that only runs in the morning (i.e. 07:00-09:00), because even if they don't overlap on "the first day", they will overlap on all subsequent days.

The output should look like this:

parent_id INT
start_time TIME
end_time TIME
valid BIT

Where valid = 1 if an overlap was found and valid = 0 if no overlap was found.

A couple of important pieces of information:

A time interval can span midnight, i.e. start_time = 23:00 and end_time = 03:00, which is a time interval of 4 hours.
Two time intervals may overlap in two different places, i.e. start_time1 = 13:00, end_time1 = 06:00, start_time2 = 04:00, end_time2 = 14:00. This would give the largest overlap as 04:00 - 06:00 = 2 hours.
There may be no overlap common for the children of a given parent, in which case the out put for that parent would be start_time = NULL, end_time = NULL and valid = 0.
If a child interval spans the whole day, then start_time = NULL and end_time = NULL. This was chosen to avoid having a day as 00:00-24:00, which would slice overlaps crossing midnight in two, i.e. parent 3 below would end up having two overlaps (23:00-24:00 and 00:00 - 004:00), in stead of one (23:00-04:00).
An overlap is only an overlap if the time interval is shared by all the children of a parent.
The time span of one child can never be longer than 24 hours.

Take this example:

parent_id  child_id  start_time  end_time
    1         1           06:00     14:00
    1         2           13:00     09:00
    1         3           07:00     09:00
    2         1           12:00     17:00
    2         2           09:00     11:00
    3         1            NULL      NULL
    3         2           23:00     04:00
    4         1            NULL      NULL
    4         2            NULL      NULL
   10         1           06:11     14:00
   10         2           06:00     09:00
   10         3           05:00     08:44
   11         1           11:38     17:00
   11         2           09:02     12:11

These data would produce this result set:

parent_id  start_time  end_time  valid
    1           07:00     09:00    1
    2            NULL      NULL    0
    3           23:00     04:00    1
    4            NULL      NULL    1
   10           06:11     08:44    1
   11           11:38     12:11    1

The overlap for a parent is the time interval that is shared by all its children. So the overlap for parent 10 is found by finding the overlap where all 3 children share time: Child 1 (06:11-14:00) and 2 (06:00-09:00) overlap from 06:11 to 09:00. This overlap time interval is then applied to child 3 (05:00-08:44), which gives an overlap of 06:11 to 08:44, since this interval is the only interval where all 3 children share common time.

I hope this makes sense.

I can do it with a cursor, but I would really prefer to avoid cursors. I have been wracking my brain about how to do it without cursors, but I have come up short. Is there any way of doing it without cursors?

EDIT: Expanded the text for clause 4, to explain the decision of having a full day be NULL to NULL, in stead of 00:00 to 00:00. EDIT: Expanded the examples with two more cases. The new cases have parent ID 10 and 11. EDIT: Inserted explanation of how the overlap for parent 10 is found. EDIT: Clarified clause 3. Added clauses 5 and 6. Went into detail about what this is all about.

Brennan Pope · Accepted Answer

Based on your question, I think your output should be:

parent_id   start_time  end_time    valid
1           07:00       09:00       1
2           NULL        NULL        0
3           23:00       04:00       1
4           NULL        NULL        1
10          06:11       08:44       1
11          11:38       12:11       1

And here is a set-based solution:

DECLARE @Times TABLE
(
    parent_id INT
    ,child_id INT
    ,start_time TIME
    ,end_time TIME
);

INSERT INTO @Times
VALUES
    (1,         1,           '06:00',     '14:00')
    ,(1,         2,           '13:00',     '09:00')
    ,(1,         3,           '07:00',     '09:00')
    ,(2,         1,           '12:00',     '17:00')
    ,(2,         2,           '09:00',     '11:00')
    ,(3,         1,            NULL,      NULL)
    ,(3,         2,           '23:00',     '04:00')
    ,(4,         1,            NULL,      NULL)
    ,(4,         2,            NULL,      NULL)
    ,(10,         1,           '06:11',     '14:00')
    ,(10,         2,           '06:00',     '09:00')
    ,(10,         3,           '05:00',     '08:44')
    ,(11,         1,           '11:38',     '17:00')
    ,(11,         2,           '09:02',     '12:11');


DECLARE @Parents TABLE
(
    parent_id INT PRIMARY KEY
    ,ChildCount INT
)
INSERT INTO @Parents
SELECT 
    parent_id
    ,COUNT(DISTINCT child_id) AS ChildCount
FROM
    @Times
GROUP BY 
    parent_id

DECLARE @StartTime DATETIME2 = '00:00'
DECLARE @MinutesInTwoDays INT = 2880
DECLARE @Minutes TABLE(ThisMinute DATETIME2 PRIMARY KEY);

WITH 
MinutesCTE AS
(
    SELECT 
        1 AS MinuteNumber
        ,@StartTime AS ThisMinute

    UNION ALL

    SELECT 
        NextMinuteNumber
        ,NextMinute
    FROM MinutesCTE
    CROSS APPLY (VALUES(MinuteNumber+1,DATEADD(MINUTE,1,ThisMinute))) NextDates(NextMinuteNumber,NextMinute)
    WHERE 
        NextMinuteNumber <= @MinutesInTwoDays
)
INSERT INTO @Minutes
SELECT ThisMinute FROM MinutesCTE M OPTION (MAXRECURSION 2880);


DECLARE @SharedMinutes TABLE
(
    ThisMinute DATETIME2 
    ,parent_id INT
    ,UNIQUE(ThisMinute,parent_id)
);

WITH TimesCTE AS
(
    SELECT
        Times.parent_id
        ,Times.child_id
        ,CAST(ISNULL(Times.start_time,'00:00') AS datetime2) AS start_time
        ,
        DATEADD
        (   
            DAY
            ,
            CASE 
                WHEN Times.end_time IS NULL THEN 2
                WHEN Times.start_time > Times.end_time THEN 1
                ELSE 0 
            END
            ,CAST(ISNULL(Times.end_time,'00:00') AS datetime2)
        ) as end_time
    FROM
        @Times Times


    UNION ALL

    SELECT
        Times.parent_id
        ,Times.child_id
        ,DATEADD(DAY,1,CAST(Times.start_time as datetime2)) AS start_time
        ,DATEADD(DAY,1,CAST(Times.end_time AS datetime2)) AS end_time
    FROM
        @Times Times
    WHERE
        start_time < end_time

)

--Get minutes shared by all children of each parent
INSERT INTO @SharedMinutes
SELECT 
    M.ThisMinute
    ,P.parent_id
FROM
    @Minutes M
JOIN
    TimesCTE T
    ON 
        M.ThisMinute BETWEEN start_time AND end_time
JOIN
    @Parents P
    ON T.parent_id = P.parent_id

GROUP BY 
    M.ThisMinute
    ,P.parent_id
    ,P.ChildCount
HAVING
    COUNT(DISTINCT T.child_id) = P.ChildCount

--get results
SELECT
    parent_id
    ,CAST(CASE WHEN start_time = '1900-01-01' AND end_time = '1900-01-02 23:59' THEN NULL ELSE start_time END AS TIME) AS start_time
    ,CAST(CASE WHEN start_time = '1900-01-01' AND end_time = '1900-01-02 23:59' THEN NULL ELSE end_time END AS TIME) AS end_time
    ,valid
FROM
(
    SELECT
        P.parent_id
        ,MIN(ThisMinute) AS start_time
        ,MAX(ThisMinute) AS end_time
        ,CASE WHEN MAX(ThisMinute) IS NOT NULL THEN 1 ELSE 0 END AS valid 
    FROM
        @Parents P
    LEFT JOIN
        @SharedMinutes SM
        ON P.parent_id = SM.parent_id
    GROUP BY
        P.parent_id

) Results

You may find that the iterative algorithm you have outlined in your question would be more efficient. But I would use a WHILE loop instead of a cursor if you take that approach.

Finding the largest time overlap in T-SQL

Answers (2)

Related Questions