Reputation: 4400
I have the transaction audit history as following. All audits start with a INIT
and may have a REVERSAL
and a MODIFICATION
. I need to find all the audits with INIT
-> REVERSAL
but no MODIFICATION
TRANSACTION_ID | ACTION | AMOUNT | AUDIT_TS |
---|---|---|---|
0 | INIT | 14 | 2022-07-10 14:49:17 |
111 | INIT | 10 | 2022-07-10 14:03:09 |
111 | REVERSAL | 10 | 2022-07-10 14:24:10 |
111 | MODIFICATION | 8 | 2022-07-10 14:49:11 |
222 | INIT | 12 | 2022-07-10 14:07:12 |
222 | REVERSAL | 12 | 2022-07-10 14:24:12 |
222 | MODIFICATION | 9 | 2022-07-10 14:43:13 |
333 | REVERSAL | 15 | 2022-07-10 14:26:14 |
333 | MODIFICATION | 14 | 2022-07-10 14:43:15 |
444 | INIT | 14 | 2022-07-10 14:42:16 |
555 | INIT | 5 | 2022-07-10 14:08:18 |
555 | REVERSAL | 5 | 2022-07-10 14:26:19 |
I tried using the Regex End Anchor ($) as following, but it also returns the audits that end with MODIFICATION
:
select
*
from audit_table
match_recognize(
partition by transaction_id
order by audit_ts
ALL ROWS PER MATCH
PATTERN (INIT REVERSAL$)
DEFINE
INIT as iff(ACTION='INIT',TRUE,FALSE),
REVERSAL as iff(ACTION='REVERSAL',TRUE,FALSE),
MODIFICATION as iff(ACTION='MODIFICATION',TRUE,FALSE)
);
Please advise.
Upvotes: 2
Views: 73
Reputation: 25903
Another why to think about this, is to just write the logic you want:
with data(TRANSACTION_ID, ACTION, AMOUNT, AUDIT_TS) as (
select
column1, column2, column3, try_to_timestamp(column4)
from values
(0 ,'INIT', 14, '2022-07-10 14:49:17'),
(111 ,'INIT', 10, '2022-07-10 14:03:09'),
(111 ,'REVERSAL', 10, '2022-07-10 14:24:10'),
(111 ,'MODIFICATION', 8, '2022-07-10 14:49:11'),
(222 ,'INIT', 12, '2022-07-10 14:07:12'),
(222 ,'REVERSAL', 12, '2022-07-10 14:24:12'),
(222 ,'MODIFICATION', 9, '2022-07-10 14:43:13'),
(333 ,'REVERSAL', 15, '2022-07-10 14:26:14'),
(333 ,'MODIFICATION', 14, '2022-07-10 14:43:15'),
(444 ,'INIT', 14, '2022-07-10 14:42:16'),
(555 ,'INIT', 5, '2022-07-10 14:08:18'),
(555 ,'REVERSAL', 5, '2022-07-10 14:26:19')
)
select
*
,first_value(ACTION) over(partition by TRANSACTION_ID order by AUDIT_TS) = 'INIT' as l1
,last_value(ACTION) over(partition by TRANSACTION_ID order by AUDIT_TS) = 'REVERSAL' as l2
,count(ACTION) over(partition by TRANSACTION_ID) = 2 as l3
,l1 and l2 and l3 logic_you_want
from data
order by 1,4;
TRANSACTION_ID | ACTION | AMOUNT | AUDIT_TS | L1 | L2 | L3 | LOGIC_YOU_WANT |
---|---|---|---|---|---|---|---|
0 | INIT | 14 | 2022-07-10 14:49:17.000 | TRUE | FALSE | FALSE | FALSE |
111 | INIT | 10 | 2022-07-10 14:03:09.000 | TRUE | FALSE | FALSE | FALSE |
111 | REVERSAL | 10 | 2022-07-10 14:24:10.000 | TRUE | FALSE | FALSE | FALSE |
111 | MODIFICATION | 8 | 2022-07-10 14:49:11.000 | TRUE | FALSE | FALSE | FALSE |
222 | INIT | 12 | 2022-07-10 14:07:12.000 | TRUE | FALSE | FALSE | FALSE |
222 | REVERSAL | 12 | 2022-07-10 14:24:12.000 | TRUE | FALSE | FALSE | FALSE |
222 | MODIFICATION | 9 | 2022-07-10 14:43:13.000 | TRUE | FALSE | FALSE | FALSE |
333 | REVERSAL | 15 | 2022-07-10 14:26:14.000 | FALSE | FALSE | TRUE | FALSE |
333 | MODIFICATION | 14 | 2022-07-10 14:43:15.000 | FALSE | FALSE | TRUE | FALSE |
444 | INIT | 14 | 2022-07-10 14:42:16.000 | TRUE | FALSE | FALSE | FALSE |
555 | INIT | 5 | 2022-07-10 14:08:18.000 | TRUE | TRUE | TRUE | TRUE |
555 | REVERSAL | 5 | 2022-07-10 14:26:19.000 | TRUE | TRUE | TRUE | TRUE |
thus:
select
*
from data
qualify first_value(ACTION) over(partition by TRANSACTION_ID order by AUDIT_TS) = 'INIT'
and last_value(ACTION) over(partition by TRANSACTION_ID order by AUDIT_TS) = 'REVERSAL'
and count(ACTION) over(partition by TRANSACTION_ID) = 2
order by 1,4;
gives:
TRANSACTION_ID | ACTION | AMOUNT | AUDIT_TS |
---|---|---|---|
555 | INIT | 5 | 2022-07-10 14:08:18.000 |
555 | REVERSAL | 5 | 2022-07-10 14:26:19.000 |
Upvotes: 1
Reputation: 175566
The pattern needs to be altered using quantifiers ^INIT{1} REVERSAL{1}$
:
select *
from audit_table
match_recognize(
partition by transaction_id
order by audit_ts
ALL ROWS PER MATCH
PATTERN (^INIT{1} REVERSAL{1}$)
DEFINE
INIT as ACTION='INIT',
REVERSAL as ACTION='REVERSAL',
MODIFICATION as ACTION='MODIFICATION'
)
ORDER BY transaction_id, audit_ts;
Snowflake supports boolean data type natively so:
INIT as iff(ACTION='INIT',TRUE,FALSE),
<=>
INIT as ACTION='INIT',
Output:
Upvotes: 2