Reputation: 323
I would like to check if a set of data follows a specific order.
For example:
+------------+----------------+--------+
| ID (GUID) | StartDate | NumCol |
+------------+----------------+--------+
| CEE8C17... | 8/17/2019 3:11 | 22 |
| BB22001... | 8/17/2019 3:33 | 21 |
| 4D40B12... | 8/17/2019 3:47 | 21 |
| 3655125... | 8/17/2019 4:06 | 20 |
| 3456CD1... | 8/17/2019 4:22 | 20 |
| 38BAF92... | 8/17/2019 4:40 | 19 |
| E60CBE8... | 8/17/2019 5:09 | 19 |
| 5F2756B... | 8/17/2019 5:24 | 18 |
+------------+----------------+--------+
I would like to know if the NumCol has any missing descending number. It is ok if there are duplicated numbers (as above there are multiple 19s), but if there is a missing number, this is what I need to identified.
Now, this would be a problem: Since it goes from 22 -> 20 and it's missing 21.
+------------+----------------+--------+
| ID (GUID) | StartDate | NumCol |
+------------+----------------+--------+
| CEE8C17... | 8/17/2019 3:11 | 22 ||
| 3655125... | 8/17/2019 4:06 | 20 |
| 3456CD1... | 8/17/2019 4:22 | 20 |
| 38BAF92... | 8/17/2019 4:40 | 19 |
| E60CBE8... | 8/17/2019 5:09 | 19 |
| 5F2756B... | 8/17/2019 5:24 | 18 |
+------------+----------------+--------+
Is there a way to check this?
Upvotes: 1
Views: 71
Reputation: 11046
You can use the IFF function and LAG to do something like this:
create or replace table T1 as
select
COLUMN1::string as "ID",
COLUMN2::String as StartDate,
COLUMN3::int as NUMCOL
from (values
('CEE8C17...','8/17/2019 3:11','22'),
('3655125...','8/17/2019 4:06','20'),
('3456CD1...','8/17/2019 4:22','20'),
('38BAF92...','8/17/2019 4:40','19'),
('E60CBE8...','8/17/2019 5:09','19'),
('5F2756B...','8/17/2019 5:24','18')
);
select *,
iff(lag(NUMCOL) over (order by NUMCOL desc) - NUMCOL > 1, 'MISSING VALUES', '')
as MISSING_VALUES_CHECK from T1
ID | STARTDATE | NUMCOL | MISSING_VALUES_CHECK |
---|---|---|---|
CEE8C17... | 8/17/2019 3:11 | 22 | |
3655125... | 8/17/2019 4:06 | 20 | MISSING VALUES |
3456CD1... | 8/17/2019 4:22 | 20 | |
38BAF92... | 8/17/2019 4:40 | 19 | |
E60CBE8... | 8/17/2019 5:09 | 19 | |
5F2756B... | 8/17/2019 5:24 | 18 |
If you want some other output other than 'MISSING VALUES' you can change that fork of the IFF function or handle it some other way. The key thing is the check the lag() vs the NUMCOL to see if the difference is >1.
To perform this check within individual ID values, you can add a partition by clause to the lag window function:
select *,
iff(lag(NUMCOL) over (partition by ID order by NUMCOL desc) - NUMCOL > 1, 'MISSING VALUES', '')
as MISSING_VALUES_CHECK from T1
Upvotes: 2