Reputation: 69
I want to flag a column based on interdependent conditions. My input data is as below
id | status | rnk |
---|---|---|
A | Open | 1 |
A | Delay | 2 |
A | In | 3 |
B | In | 1 |
B | Out | 2 |
B | Delay | 3 |
B | count | 4 |
C | In | 1 |
C | Close | 2 |
C | out | 3 |
D | Close | 1 |
D | Open | 2 |
D | Delay | 3 |
D | In | 4 |
My output should look like
id | status | rnk | flag |
---|---|---|---|
A | Open | 1 | N |
A | Delay | 2 | Y |
A | In | 3 | N |
B | In | 1 | N |
B | Out | 2 | N |
B | Delay | 3 | N |
B | count | 4 | N |
C | In | 1 | N |
C | Close | 2 | N |
C | out | 3 | N |
D | Close | 1 | N |
D | Open | 2 | N |
D | Delay | 3 | Y |
D | In | 4 | N |
Logic - if status column is anything other than Delay then the flag will be N. If the status column is Delay and if the status is either 'Open' or 'Close' for records which are having less rnk than that of Delay within the same ID then flag will be Y else N.
Example - for ID 'A' we have a status 'Delay' and its rank is 2, now we need to check if the status of A with rank < 2 is either 'Open' or 'Close' then flag 'Delay' to 'Y'
please note: rnk column is already populated in the table based on different logic
Below is the query I have tried, but I am getting flag 'N' for all the records,
SELECT
*,
CASE WHEN status != 'Delay' THEN 'N'
WHEN rnk < (COALESCE(MAX(CASE WHEN status = 'Delay' THEN rnk ELSE -1 END) OVER(PARTITION BY id)))
AND status IN ('Open','Close') THEN 'Y'
ELSE 'N'
END AS flag
FROM TABLE
Upvotes: 0
Views: 405
Reputation: 6741
Use a grouped Common Table expression with the maximum rank and row count of all rows with status Open or Close, and left-join the base table with it:
-- your input, don't use in real query...
WITH
indata(id,status,rnk) AS (
SELECT 'A','Open',1
UNION ALL SELECT 'A','Delay',2
UNION ALL SELECT 'A','In',3
UNION ALL SELECT 'B','In',1
UNION ALL SELECT 'B','Out',2
UNION ALL SELECT 'B','Delay',3
UNION ALL SELECT 'B','count',4
UNION ALL SELECT 'C','In',1
UNION ALL SELECT 'C','Close',2
UNION ALL SELECT 'C','out',3
UNION ALL SELECT 'D','Close',1
UNION ALL SELECT 'D','Open',2
UNION ALL SELECT 'D','Delay',3
UNION ALL SELECT 'D','In',4
)
-- input ends here, real query starts below
-- replace following comma with "WITH" ...
,
prev_stats AS (
SELECT
id
, MAX(rnk) AS rnk
, COUNT(*) AS num
FROM indata
WHERE status IN ('Open','Close')
GROUP BY id
)
SELECT
indata.*
, CASE
WHEN status <> 'Delay' THEN 'N'
ELSE
CASE
WHEN prev_stats.num > 0 THEN 'Y'
ELSE 'N'
END
END AS flag
FROM indata
LEFT JOIN prev_stats ON indata.id = prev_stats.id
AND indata.rnk > prev_stats.rnk
;
Result:
id | status | rnk | flag |
---|---|---|---|
A | Open | 1 | N |
A | Delay | 2 | Y |
A | In | 3 | N |
B | In | 1 | N |
B | Out | 2 | N |
B | Delay | 3 | N |
B | count | 4 | N |
C | In | 1 | N |
C | Close | 2 | N |
C | out | 3 | N |
D | Close | 1 | N |
D | Open | 2 | N |
D | Delay | 3 | Y |
D | In | 4 | N |
Upvotes: -1
Reputation: 173013
Consider below simple approach
select *, if(
status = 'Delay' and
countif(status in ('Open', 'Close')) over(partition by id order by rnk) > 0,
'Y', 'N') as flag
from your_table
if applied to sample data in your question - output is
Upvotes: 1
Reputation: 49375
A correlated subquery is here more help full
SELECT
*,
CASE WHEN status != 'Delay' THEN 'N'
WHEN Exists( SELECT 1 FROM Table1 ta1 WHERE ta1.id = t1.id
AND status IN ('Open','Close') AND ta1.rnk < t1.rnk) THEN 'Y'
ELSE 'N'
END AS flag
FROM Table1 t1
id | status | rnk | flag |
---|---|---|---|
A | Open | 1 | N |
A | Delay | 2 | Y |
A | In | 3 | N |
B | In | 1 | N |
B | Out | 2 | N |
B | Delay | 3 | N |
B | count | 4 | N |
C | In | 1 | N |
C | Close | 2 | N |
C | out | 3 | N |
D | Close | 1 | N |
D | Open | 2 | N |
D | Delay | 3 | Y |
D | In | 4 | N |
Upvotes: 1