how to get non-overlapping distinct intervals in postgresql table?

Question

the table has user sessions, i need distinct non overlapping sessions printed.

CREATE TABLE SESSIONS(
            id serial NOT NULL PRIMARY KEY, 
            ctn INT NOT NULL, 
            day DATE NOT NULL,
            f_time TIME(0) NOT NULL,
            l_time TIME(0) NOT  NULL
        );     
    INSERT INTO SESSIONS(id, ctn, day, f_time, l_time)
    VALUES
    (1, 707, '2019-06-18', '10:48:25', '10:56:17'),
    (2, 707, '2019-06-18', '10:48:33', '10:56:17'),
    (3, 707, '2019-06-18', '10:53:17', '11:00:49'),
    (4, 707, '2019-06-18', '10:54:31', '10:57:37'),
    (5, 707, '2019-06-18', '11:03:59', '11:10:39'),
    (6, 707, '2019-06-18', '11:04:41', '11:08:02'),
    (7, 707, '2019-06-18', '11:11:04', '11:19:39');

sql fiddle

My table looks like this:

id  ctn day         f_time      l_time
1   707 2019-06-18  10:48:25    10:56:17
2   707 2019-06-18  10:48:33    10:56:17
3   707 2019-06-18  10:53:17    11:00:49
4   707 2019-06-18  10:54:31    10:57:37
5   707 2019-06-18  11:03:59    11:10:39
6   707 2019-06-18  11:04:41    11:08:02
7   707 2019-06-18  11:11:04    11:19:39

Now i need distinct non-overlapping user sessions, so it should give me

1.  start_time: 10:48:25  end_time: 11:00:49  duration: 12min,24 sec
2.  start_time: 11:03:59  end_time: 11:10:39  duration: 6min,40 sec
3.  start_time: 11:11:04  end_time: 11:19:33  duration: 8min,29 sec

GMB · Accepted Answer

This is a gaps-and-islands problem. Here is an approach using window functions:

select
    ctn,
    min(f_ts) start_ts,
    max(l_ts) end_ts,
    max(l_ts) - min(f_ts) duration
from (
    select 
        s.*,
        count(*) filter(where f_ts > lag_l_ts) over(partition by ctn order by f_ts) grp
    from (
        select
            s.*,
            lag(l_ts) over(partition by ctn order by f_ts) lag_l_ts
        from (
            select
                s.*,
                (day + l_time)::timestamp l_ts,
                (day + f_time)::timestamp f_ts
            from sessions s
        ) s
    ) s
) s
group by ctn, grp
order by ctn, start_ts

The query works as follows:

first we rebuild proper timestamps from the date and time parts: storing data this way makes it uneasy to manipulate it (and it does not allow sessions to spread over different days)
one data is normalized, we use lag() to get the end timestamp of the "previous" row
we can then build groups of "adjacent" records: every time the start timestamp is greater than the previous end timestamp, a new group starts
the final step is aggregation

Demo on DB Fiddle:

ctn | start_ts            | end_ts              | duration
--: | :------------------ | :------------------ | :-------
707 | 2019-06-18 10:48:25 | 2019-06-18 11:00:49 | 00:12:24
707 | 2019-06-18 11:03:59 | 2019-06-18 11:10:39 | 00:06:40
707 | 2019-06-18 11:11:04 | 2019-06-18 11:19:39 | 00:08:35

how to get non-overlapping distinct intervals in postgresql table?

Answers (1)

Related Questions