Reputation: 749
I have this table in Redshift, and I'm trying to write a query for the following dataset. For rows such as row 3, which are 'RENEWAL SUCCESS' and are preceded by a 'SUB SUCCESS', I want to set is_first_renewal = true. However, any number of 'RENEWAL FAILURE' rows may sit between the 'SUB SUCCESS' and the first successful renewal, so a plain LAG window function, which only looks at the immediately preceding row, does not work for this scenario.
I also cannot filter out FAILURES as my query needs those.
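| id | phone | op   | ts              | pr | status  | result  | is_first_renewal |
|----|-------|------|-----------------|----|---------|---------|------------------|
| 1  | 3345  | xxxx | 01/01/2018 6:59 | 0  | SUB     | SUCCESS |                  |
| 2  | 3345  | xxxx | 28/02/2018 6:59 |    | RENEWAL | FAILURE |                  |
| 3  | 3345  | xxxx | 01/03/2018 6:59 | 20 | RENEWAL | SUCCESS | TRUE             |
| 4  | 3345  | xxxx | 02/04/2018 6:59 | 20 | RENEWAL | SUCCESS | FALSE            |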
My current query is this:
    CASE
        WHEN (status = 'RENEWAL' AND result = 'SUCCESS')
             AND ( (last_value(CASE
                                   WHEN pr = 0.0 AND result = 'SUCCESS' THEN
                                       TRUE
                               END))
                   OVER (PARTITION BY phone
                         ORDER BY ts ASC
                         ROWS unbounded preceding) = 'SUB' ) THEN
            TRUE
        ELSE
            FALSE
    END AS is_first_renewal
    FROM notifications
    WHERE ((charge_status = 'SUCCESS')
           OR (status = 'RENEWAL'
               AND result = 'FAILED'))
    ORDER BY
        op,
        phone,
        ts;
Thus, the objective is to find a way to get the LAG function to ignore those 'FAILURE' items... until it can find the preceding 'sub success' item. Any suggestions? Thanks in advance.
Upvotes: 2
Views: 929
Reputation: 522396
The approach that came to mind was to instead target the individual records which should be labelled TRUE. Consider this:
    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY phone ORDER BY ts) rn
        FROM notifications n1
        WHERE EXISTS (SELECT 1
                      FROM notifications n2
                      WHERE n1.phone = n2.phone AND
                            n2.ts < n1.ts AND
                            n2.status = 'SUB' AND
                            n2.result = 'SUCCESS') AND
              n1.status = 'RENEWAL' AND
              n1.result = 'SUCCESS'
    )
    SELECT n1.*,
           CASE WHEN n2.rn = 1 THEN 'TRUE'
                WHEN n2.rn > 1 THEN 'FALSE' END AS is_first_renewal
    FROM notifications n1
    LEFT JOIN cte n2
        ON n1.phone = n2.phone AND n1.ts = n2.ts;
This query seems to be working in the Postgres demo link given below.
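To try the idea without a Redshift or Postgres instance, here is a minimal sketch that loads the four sample rows from the question into an in-memory SQLite database and runs essentially the same CTE query. Several things here are assumptions and not part of the original post: the use of SQLite (version 3.25 or later, for window functions), the column types, the helper name `flag_first_renewals`, and the timestamps rewritten as ISO strings so that text ordering matches chronological ordering. Behaviour on Redshift may differ in details.

```python
import sqlite3

# Sample rows from the question. Timestamps are rewritten as ISO strings
# (an assumption) so that SQLite's text comparison orders them by time.
ROWS = [
    (1, "3345", "xxxx", "2018-01-01 06:59", 0, "SUB", "SUCCESS"),
    (2, "3345", "xxxx", "2018-02-28 06:59", None, "RENEWAL", "FAILURE"),
    (3, "3345", "xxxx", "2018-03-01 06:59", 20, "RENEWAL", "SUCCESS"),
    (4, "3345", "xxxx", "2018-04-02 06:59", 20, "RENEWAL", "SUCCESS"),
]

# Same shape as the query above: number only the successful renewals that
# have an earlier successful SUB, then join those numbers back to all rows.
QUERY = """
WITH cte AS (
    SELECT n1.*,
           ROW_NUMBER() OVER (PARTITION BY phone ORDER BY ts) AS rn
    FROM notifications n1
    WHERE n1.status = 'RENEWAL' AND n1.result = 'SUCCESS'
      AND EXISTS (SELECT 1 FROM notifications n2
                  WHERE n2.phone = n1.phone AND n2.ts < n1.ts
                    AND n2.status = 'SUB' AND n2.result = 'SUCCESS')
)
SELECT n1.id,
       CASE WHEN n2.rn = 1 THEN 'TRUE'
            WHEN n2.rn > 1 THEN 'FALSE' END AS is_first_renewal
FROM notifications n1
LEFT JOIN cte n2
       ON n1.phone = n2.phone AND n1.ts = n2.ts
ORDER BY n1.id;
"""


def flag_first_renewals(rows):
    """Hypothetical helper: load rows into SQLite and return (id, flag) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notifications ("
        "id INTEGER, phone TEXT, op TEXT, ts TEXT, "
        "pr REAL, status TEXT, result TEXT)"
    )
    conn.executemany(
        "INSERT INTO notifications VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row_id, flag in flag_first_renewals(ROWS):
        print(row_id, flag)
```

On the sample data this returns `[(1, None), (2, None), (3, 'TRUE'), (4, 'FALSE')]`: the failed renewal in row 2 is kept in the output but does not affect the numbering, because failures are filtered out only inside the CTE. Note that the SUB row and the failure get NULL rather than FALSE, since the CASE expression has no ELSE branch.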
Upvotes: 3