EM0
EM0

Reputation: 6347

PostgreSQL query to detect overlapping time ranges

I have a table in PostgreSQL 9.2 that looks like this (simplified):

CREATE TABLE my_features
(
  id integer NOT NULL,
  feature_id integer NOT NULL,
  begin_time timestamp NOT NULL,
  end_time timestamp
)

For each feature_id there may be multiple rows with time ranges specified by begin_time/end_time. They may overlap, but this is relatively rare. I'm looking for a fast way to find all feature_ids that have/don't have any overlaps.

I tried to do this using window functions, like this:

SELECT feature_id, bool_or(end_time > lead(begin_time) OVER ts_win) OVER ts_win AS overlaps_any
FROM my_features
WINDOW ts_win AS (PARTITION BY feature_id ORDER BY begin_time)

... but this doesn't work:

ERROR:  window function calls cannot be nested

The algorithm is simple: order the rows for a given feature_id by begin_time and check if any end_time > the next begin_time (if any). I suspect there must be an easy way to do this, perhaps with tsrange functions, but can't seem to find it just now.

Upvotes: 21

Views: 22783

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1271003

Here is an observation. If there are overlapping time periods for a feature, then at least one time period overlaps with the preceding one as defined by begin_time. (You can look at this the other way. If there are no such overlaps then there is always a gap between one time frame and the next and nothing overlaps.)

This leads to the following query for determining overlaps:

select f.feature_id
from (select f.feature_id,
             (case when lag(end_time) over (partition by feature_id order by begin_time) > begin_time
                   then 1 else 0
              end) as HasOverlap
      from my_features f
     ) f
group by f.feature_id
having max(HaxOverlap) = 1;

Upvotes: 4

user330315
user330315

Reputation:

This can indeed be done using range types.

The following selects all those rows that do have overlapping ranges:

select f1.*
from my_features f1
where exists (select 1
              from my_features f2
              where tsrange(f2.begin_time, f2.end_time, '[]') && tsrange(f1.begin_time, f1.end_time, '[]')
                and f2.feature_id = f1.feature_id
                and f2.id <> f1.id);

When you change the condition to NOT EXISTS you'll find those that don't have any overlapping ranges.

SQLFiddle example: http://sqlfiddle.com/#!15/40b1e/1

tsrange(f2.begin_time, f2.end_time, '[]') creates a range that includes the upper and lower bounds. You can also create ranges that exclude either one or both.

More details can be found in the manual:
http://www.postgresql.org/docs/current/static/rangetypes.html#RANGETYPES-INCLUSIVITY

The && operator checks if the two ranges overlap: http://www.postgresql.org/docs/current/static/functions-range.html

(I just wish Oracle had something fancy like that...)

Upvotes: 40

Related Questions