Pentium10
Pentium10

Reputation: 208012

Find the gap in ID column + select the prev/next date column

We have a table with ID autoincrement column with gaps. Each row has also a created date.

We are trying to find out for each gap ID (the missing ones) the prev/next date that is available in our table.

We already built a SQL that identifies the gaps (solution from here), so I have them in a table cached, but from here how to find the prev/next created date that covers the gap from the original table.

Input:

+----+------------+
| 84 | 1443728132 |
| 91 | 1443728489 |
| 93 | 1443729058 |
| 94 | 1443729200 |
+----+------------+

Output

+--------+------------+------------+
| gap_id |  prev_dt   |  next_dt   |
+--------+------------+------------+
|     85 | 1443728132 | 1443728489 |
|     86 | 1443728132 | 1443728489 |
|     87 | 1443728132 | 1443728489 |
|     88 | 1443728132 | 1443728489 |
|     89 | 1443728132 | 1443728489 |
|     90 | 1443728132 | 1443728489 |
|     92 | 1443728489 | 1443729058 |
+--------+------------+------------+

I ended up doing with Google BigQuery.

Upvotes: 0

Views: 506

Answers (3)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173161

For BigQuery Standard SQL

WITH yourTable AS (
SELECT 84 AS id, 1443728132 AS dt UNION ALL
SELECT 91 AS id, 1443728489 AS dt UNION ALL
SELECT 93 AS id, 1443729058 AS dt UNION ALL
SELECT 94 AS id, 1443729200 AS dt 
),
nums AS (
  SELECT num 
  FROM UNNEST(GENERATE_ARRAY((SELECT MIN(id) FROM YourTable), (SELECT MAX(id) FROM YourTable))) AS num
),
gaps AS (
  SELECT
    LAG (id) OVER (ORDER BY id) + 1 AS gap_start,
    id - 1 AS gap_end,
    LAG (dt) OVER (ORDER BY id) AS prev_dt,
    dt AS next_dt,
    CASE 
      WHEN LAG (id) OVER (ORDER BY id) + 1 <> id THEN 'Y'
    END AS is_gap
  FROM
    yourTable 
)
SELECT num as gap_id, prev_dt, next_dt
FROM gaps JOIN nums 
ON num BETWEEN gap_start AND gap_end
WHERE is_gap = 'Y'
ORDER BY num  

output:

gap_id     prev_dt     next_dt   
85      1443728132  1443728489   
86      1443728132  1443728489   
87      1443728132  1443728489   
88      1443728132  1443728489   
89      1443728132  1443728489   
90      1443728132  1443728489   
92      1443728489  1443729058   

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270773

I would recommend grouping the results into ranges:

select min(id + 1) as first_missing_id,
       (next_id - 1) as last_missing_id,
       next_dte
from (select t.*,
             lead(id) over (order by id) as next_id,
             lead(dte) over (order by id) as next_dte
      from t
     ) t
where next_id <> id + 1
group by next_id, next_dte;

Getting individual ids is tricky. After all, if you have 1, 1000000, 1000000000 then you could be generating lots of rows.

Upvotes: 2

David דודו Markovitz
David דודו Markovitz

Reputation: 44981

Should work on most databases other than MySQL

select      *

from       (select      lag (id) over (order by id) + 1                                 as gap_start
                       ,id - 1                                                          as gap_end
                       ,lag (dt) over (order by id)                                     as dt_before_gap
                       ,dt                                                              as dt_after_gap
                       ,case when lag (id) over (order by id) + 1 <> id then 'Y' end    as is_gap

            from        t
            ) t

where       is_gap = 'Y'
;

Upvotes: 1

Related Questions