Reputation: 208012
We have a table with ID
autoincrement column with gaps. Each row has also a created
date.
We are trying to find out for each gap ID (the missing ones) the prev/next date that is available in our table.
We already built a SQL that identifies the gaps (solution from here), so I have them in a table cached, but from here how to find the prev/next created
date that covers the gap from the original table.
Input:
+----+------------+
| 84 | 1443728132 |
| 91 | 1443728489 |
| 93 | 1443729058 |
| 94 | 1443729200 |
+----+------------+
Output
+--------+------------+------------+
| gap_id | prev_dt | next_dt |
+--------+------------+------------+
| 85 | 1443728132 | 1443728489 |
| 86 | 1443728132 | 1443728489 |
| 87 | 1443728132 | 1443728489 |
| 88 | 1443728132 | 1443728489 |
| 89 | 1443728132 | 1443728489 |
| 90 | 1443728132 | 1443728489 |
| 92 | 1443728489 | 1443729058 |
+--------+------------+------------+
I ended up doing with Google BigQuery.
Upvotes: 0
Views: 506
Reputation: 173161
For BigQuery Standard SQL
WITH yourTable AS (
SELECT 84 AS id, 1443728132 AS dt UNION ALL
SELECT 91 AS id, 1443728489 AS dt UNION ALL
SELECT 93 AS id, 1443729058 AS dt UNION ALL
SELECT 94 AS id, 1443729200 AS dt
),
nums AS (
SELECT num
FROM UNNEST(GENERATE_ARRAY((SELECT MIN(id) FROM YourTable), (SELECT MAX(id) FROM YourTable))) AS num
),
gaps AS (
SELECT
LAG (id) OVER (ORDER BY id) + 1 AS gap_start,
id - 1 AS gap_end,
LAG (dt) OVER (ORDER BY id) AS prev_dt,
dt AS next_dt,
CASE
WHEN LAG (id) OVER (ORDER BY id) + 1 <> id THEN 'Y'
END AS is_gap
FROM
yourTable
)
SELECT num as gap_id, prev_dt, next_dt
FROM gaps JOIN nums
ON num BETWEEN gap_start AND gap_end
WHERE is_gap = 'Y'
ORDER BY num
output:
gap_id prev_dt next_dt
85 1443728132 1443728489
86 1443728132 1443728489
87 1443728132 1443728489
88 1443728132 1443728489
89 1443728132 1443728489
90 1443728132 1443728489
92 1443728489 1443729058
Upvotes: 1
Reputation: 1270773
I would recommend grouping the results into ranges:
select min(id + 1) as first_missing_id,
(next_id - 1) as last_missing_id,
next_dte
from (select t.*,
lead(id) over (order by id) as next_id,
lead(dte) over (order by id) as next_dte
from t
) t
where next_id <> id + 1
group by next_id, next_dte;
Getting individual ids is tricky. After all, if you have 1, 1000000, 1000000000 then you could be generating lots of rows.
Upvotes: 2
Reputation: 44981
Should work on most databases other than MySQL
select *
from (select lag (id) over (order by id) + 1 as gap_start
,id - 1 as gap_end
,lag (dt) over (order by id) as dt_before_gap
,dt as dt_after_gap
,case when lag (id) over (order by id) + 1 <> id then 'Y' end as is_gap
from t
) t
where is_gap = 'Y'
;
Upvotes: 1