Reputation: 6347
I have a table in PostgreSQL 9.2 that looks like this (simplified):
CREATE TABLE my_features
(
id integer NOT NULL,
feature_id integer NOT NULL,
begin_time timestamp NOT NULL,
end_time timestamp
)
For each feature_id there may be multiple rows with time ranges specified by begin_time/end_time. They may overlap, but this is relatively rare. I'm looking for a fast way to find all feature_ids that have/don't have any overlaps.
I tried to do this using window functions, like this:
SELECT feature_id, bool_or(end_time > lead(begin_time) OVER ts_win) OVER ts_win AS overlaps_any
FROM my_features
WINDOW ts_win AS (PARTITION BY feature_id ORDER BY begin_time)
... but this doesn't work:
ERROR: window function calls cannot be nested
The algorithm is simple: order the rows for a given feature_id by begin_time and check if any end_time > the next begin_time (if any). I suspect there must be an easy way to do this, perhaps with tsrange functions, but can't seem to find it just now.
Upvotes: 21
Views: 22783
Reputation: 1271003
Here is an observation. If there are overlapping time periods for a feature, then at least one time period overlaps with the preceding one as defined by begin_time
. (You can look at this the other way. If there are no such overlaps then there is always a gap between one time frame and the next and nothing overlaps.)
This leads to the following query for determining overlaps:
select f.feature_id
from (select f.feature_id,
(case when lag(end_time) over (partition by feature_id order by begin_time) > begin_time
then 1 else 0
end) as HasOverlap
from my_features f
) f
group by f.feature_id
having max(HaxOverlap) = 1;
Upvotes: 4
Reputation:
This can indeed be done using range types.
The following selects all those rows that do have overlapping ranges:
select f1.*
from my_features f1
where exists (select 1
from my_features f2
where tsrange(f2.begin_time, f2.end_time, '[]') && tsrange(f1.begin_time, f1.end_time, '[]')
and f2.feature_id = f1.feature_id
and f2.id <> f1.id);
When you change the condition to NOT EXISTS
you'll find those that don't have any overlapping ranges.
SQLFiddle example: http://sqlfiddle.com/#!15/40b1e/1
tsrange(f2.begin_time, f2.end_time, '[]')
creates a range that includes the upper and lower bounds. You can also create ranges that exclude either one or both.
More details can be found in the manual:
http://www.postgresql.org/docs/current/static/rangetypes.html#RANGETYPES-INCLUSIVITY
The &&
operator checks if the two ranges overlap: http://www.postgresql.org/docs/current/static/functions-range.html
(I just wish Oracle had something fancy like that...)
Upvotes: 40