Reputation: 7699
I have a table invoices
with a field invoice_number
. This is what happens when i execute select invoice_number from invoice
:
invoice_number
--------------
1
2
3
5
6
10
11
I want a SQL that gives me the following result:
gap_start | gap_end
4 | 4
7 | 9
How can i write a SQL to perform such query? I am using PostgreSQL.
Upvotes: 17
Views: 7108
Reputation:
The name of this problem is the "Gaps and Islands problem" which can be done with any modern SQL, using window functions:
select invoice_number + 1 as gap_start,
next_nr - 1 as gap_end
from (
select invoice_number,
lead(invoice_number) over (order by invoice_number) as next_nr
from invoices
) nr
where invoice_number + 1 <> next_nr;
SQLFiddle: http://sqlfiddle.com/#!15/1e807/1
Walkthrough example here using row_number over partition and interval: Postgres Consecutive Days, gaps and islands, Tabibitosan
Upvotes: 37
Reputation: 10840
We can use a simpler technique to get all missing values first, by joining on a generated sequence column like so:
select series
from generate_series(1, 11, 1) series
left join invoices on series = invoices.invoice_number
where invoice_number is null;
This gets us the series of missing numbers, which can be useful on it's own in some cases.
To get the gap start/end range, we can instead join the source table with itself.
select invoices.invoice_number + 1 as start,
min(fr.invoice_number) - 1 as stop
from invoices
left join invoices r on invoices.invoice_number = r.invoice_number - 1
left join invoices fr on invoices.invoice_number < fr.invoice_number
where r.invoice_number is null
and fr.invoice_number is not null
group by invoices.invoice_number,
r.invoice_number;
dbfiddle: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=32c5f3c021b0f1a876305a2bd3afafc9
This is probably less optimised than the above solutions, but could be useful in SQL servers that don't support lead()
function perhaps.
Full credit goes to this excellent page in SILOTA docs: http://www.silota.com/docs/recipes/sql-gap-analysis-missing-values-sequence.html
I highly recommend reading it, as it explains the solution step by step.
Upvotes: 6
Reputation: 45760
I found another query:
select invoice_number + lag gap_start,
invoice_number + lead - 1 gap_end
from (select invoice_number,
invoice_number - lag(invoice_number) over w lag,
lead(invoice_number) over w - invoice_number lead
from invoices window w as (order by invoice_number)) x
where lag = 1 and lead > 1;
Upvotes: 1