user3461502
user3461502

Reputation: 323

How to check if column follows an order?

I would like to check if a set of data follows a specific order.

For example:

+------------+----------------+--------+
| ID (GUID)  |   StartDate    | NumCol |
+------------+----------------+--------+
| CEE8C17... | 8/17/2019 3:11 |     22 |
| BB22001... | 8/17/2019 3:33 |     21 |
| 4D40B12... | 8/17/2019 3:47 |     21 |
| 3655125... | 8/17/2019 4:06 |     20 |
| 3456CD1... | 8/17/2019 4:22 |     20 |
| 38BAF92... | 8/17/2019 4:40 |     19 |
| E60CBE8... | 8/17/2019 5:09 |     19 |
| 5F2756B... | 8/17/2019 5:24 |     18 |
+------------+----------------+--------+

I would like to know if the NumCol has any missing descending number. It is ok if there are duplicated numbers (as above there are multiple 19s), but if there is a missing number, this is what I need to identified.

Now, this would be a problem: Since it goes from 22 -> 20 and it's missing 21.

+------------+----------------+--------+
| ID (GUID)  |   StartDate    | NumCol |
+------------+----------------+--------+
| CEE8C17... | 8/17/2019 3:11 |     22 ||
| 3655125... | 8/17/2019 4:06 |     20 |
| 3456CD1... | 8/17/2019 4:22 |     20 |
| 38BAF92... | 8/17/2019 4:40 |     19 |
| E60CBE8... | 8/17/2019 5:09 |     19 |
| 5F2756B... | 8/17/2019 5:24 |     18 |
+------------+----------------+--------+

Is there a way to check this?

Upvotes: 1

Views: 71

Answers (1)

Greg Pavlik
Greg Pavlik

Reputation: 11046

You can use the IFF function and LAG to do something like this:

create or replace table T1 as 
select 
COLUMN1::string as "ID",
COLUMN2::String as StartDate,
COLUMN3::int as NUMCOL
from (values
('CEE8C17...','8/17/2019 3:11','22'),
('3655125...','8/17/2019 4:06','20'),
('3456CD1...','8/17/2019 4:22','20'),
('38BAF92...','8/17/2019 4:40','19'),
('E60CBE8...','8/17/2019 5:09','19'),
('5F2756B...','8/17/2019 5:24','18')
);


select *, 
iff(lag(NUMCOL) over (order by NUMCOL desc) - NUMCOL > 1, 'MISSING VALUES', '') 
as MISSING_VALUES_CHECK from T1
ID STARTDATE NUMCOL MISSING_VALUES_CHECK
CEE8C17... 8/17/2019 3:11 22
3655125... 8/17/2019 4:06 20 MISSING VALUES
3456CD1... 8/17/2019 4:22 20
38BAF92... 8/17/2019 4:40 19
E60CBE8... 8/17/2019 5:09 19
5F2756B... 8/17/2019 5:24 18

If you want some other output other than 'MISSING VALUES' you can change that fork of the IFF function or handle it some other way. The key thing is the check the lag() vs the NUMCOL to see if the difference is >1.

To perform this check within individual ID values, you can add a partition by clause to the lag window function:

select *, 
iff(lag(NUMCOL) over (partition by ID order by NUMCOL desc) - NUMCOL > 1, 'MISSING VALUES', '') 
as MISSING_VALUES_CHECK from T1

Upvotes: 2

Related Questions