Calculate time difference of rows which match condition in PostgreSQL

Question

I have a PostgreSQL table where I need to calculate the time interval in which a name is in the state 1. A name can be in this state multiple times and I need the time of each individual interval. I am using a temp table at the moment in which I save the start and endpoints of an interval and then calculate the time difference of it. But I don't like this solution and think there must be a better way of doing it. I hope to find some SQL Pros out here who can show me some magical and easy solution.

Here is how the table looks like:

|name   |state |time                      |
|-------|------|--------------------------|
|one    | 1    |'2020-11-11 01:00:02.5+01'| (start of first interval)
|one    | 1    |'2020-11-11 01:00:04.5+01'| (end of first interval) = 2 seconds
|one    | 0    |'2020-11-11 01:00:05.0+01'|
|one    | 0    |'2020-11-11 01:00:05.5+01'|
|one    | 1    |'2020-11-11 01:00:10.5+01'| (start of second interval)
|one    | 1    |'2020-11-11 01:00:11.5+01'| 
|one    | 1    |'2020-11-11 01:00:12.5+01'| (end of second interval) = 2 seconds
|two    | 0    |'2020-11-11 01:00:13.0+01'|
|two    | 0    |'2020-11-11 01:00:14.5+01'|
|two    | 1    |'2020-11-11 01:00:15.0+01'| (start of third interval)
|two    | 1    |'2020-11-11 01:00:15.5+01'| (end of third interval) = 0.5 seconds
|two    | 0    |'2020-11-11 01:00:16.5+01'|

SQL script for the sample table:

CREATE TABLE intervals(
    name char(10),
    state integer,
    time timestamptz
);

INSERT INTO intervals(name, state, time) VALUES
    ('one', 1, '2020-11-11 01:00:02.5+01'),
    ('one', 1, '2020-11-11 01:00:04.5+01'),
    ('one', 0, '2020-11-11 01:00:05.0+01'),
    ('one', 0, '2020-11-11 01:00:05.5+01'),
    ('one', 1, '2020-11-11 01:00:10.5+01'),
    ('one', 1, '2020-11-11 01:00:11.5+01'),
    ('one', 1, '2020-11-11 01:00:12.5+01'),
    ('two', 0, '2020-11-11 01:00:13.0+01'),
    ('two', 0, '2020-11-11 01:00:14.5+01'),
    ('two', 1, '2020-11-11 01:00:15.0+01'),
    ('two', 1, '2020-11-11 01:00:15.5+01'),
    ('two', 0, '2020-11-11 01:00:16.5+01');

Gordon Linoff · Accepted Answer

This is a type of gaps-and-islands problem. In this case, the difference of row numbers should do what you want:

select name, min(time), max(time),
       max(time) - min(time) as duration
from (select i.*,
             row_number() over (partition by name order by time) as seqnum,
             row_number() over (partition by name, state order by time) as seqnum_2
      from intervals i
     ) i
where state = 1
group by name, (seqnum - seqnum_2), state;

Here is a db<>fiddle.

The logic on the difference of row numbers is a little tricky to explain. If you run the subquery, you will see how the difference of row numbers on rows with the same name and adjacent values of state is constant. The aggregation just aggregates by the difference, which is constant for them.

Calculate time difference of rows which match condition in PostgreSQL

Answers (1)

Related Questions