Reputation: 87
I try to find the gaps in data generated at the same time.
Simplified table looks like this:
--------------------
| row | date |
--------------------
| 1 | 2017-01-01 |
| 2 | 2017-01-02 |
| 3 | 2017-01-03 |
| 4 | 2017-02-01 |
| 5 | 2017-02-04 |
The result of the query should look like this:
------------------------
| date | diff |
------------------------
| 2017-01-03 | 27 days |
2017-01-03 the gap lasting 27 days has started.
I wrote such a query and it works. However, it takes very long time on larger dataset (about 20k rows).
SELECT
t.date,
t.diff
FROM
(WITH sd AS
(SELECT *
FROM observation
WHERE nr_id='7810'
AND date BETWEEN '2014-01-01' AND '2014-12-31'
ORDER BY date ASC) SELECT c.date,
coalesce(n.date, NULL) - c.date AS diff
FROM sd AS c
LEFT JOIN sd AS n ON n.date =
(SELECT MIN(date)
FROM sd WHERE date > c.date)) AS t
WHERE t.diff > '6 days'
Does anyone have any other idea, how to write it more effectively ?
ANSWER (modified approach sent by Gordon Linoff):
SELECT * FROM(
SELECT t.c_date, t.next_date, t.next_date - t.c_date as diff FROM(
SELECT o.date as c_date,
lead(o.date) over (ORDER BY o.date ASC) AS next_date
FROM observation o
WHERE nr_id = '7810' and
date between '2012-01-01' and '2017-12-31') as t) as b
WHERE diff > '6 days'
Upvotes: 0
Views: 537
Reputation: 1269493
Use lead()
:
select o.date, (next_date - date) as diff
from (select o.*, lead(date) over (order by date) as next_date
from observation o
) o
where next_date > date + interval '6 day';
Postgres should make use of an index on observation(date)
.
You can add the where
clause to the subquery:
where nr_id = 7810 and
date between '2014-01-01' and '2014-12-31'
I am guessing that nr_id
is a number. If you use this version, then you want an index on observation(nr_id, date)
.
Upvotes: 2