Reputation: 4654
I have a table of the form below that has: user id
, product
activated, event
that occurs and time
that corresponds to the event that occurs.
UID | |Product | Event | Time
A C1 F 2017-06-23
A C2 S 2017-06-21
A C1 S 2017-06-19
A C1 S 2017-06-17
B C3 F 2017-06-12
B C3 S 2017-06-12
C C2 F 2017-06-02
C C2 F 2017-06-01
I would like to find the time difference from the previousS
event to F
current event for each user and product.
UID | |Product | Event | Time | Days_Diff
A C1 F 2017-06-23 4
A C2 S 2017-06-21 NULL
A C1 S 2017-06-19 NULL
A C1 S 2017-06-17 NULL
B C3 F 2017-06-12 0
B C3 S 2017-06-12 NULL
C C2 F 2017-06-02 NULL
C C2 F 2017-06-01 NULL
I tried something like the below but it doesn't help me keep track of the last product and event
SELECT UID, Product, Event, Time,
CASE
-- product is equal to last product
WHEN Product = LAG(Product, 1) OVER (PARTITION BY UID, Product ORDER BY Time)
-- current event = F and last event = S
AND Event = 'F' AND LAG(Event, 1) OVER (PARTITION BY UID, Product ORDER BY Time) = 'S'
-- subtract current time by the last time this product was activated
THEN DATEDIFF('DAY', MAX(Time) OVER (PARTITION BY UID, Product ORDER BY Time
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), Time)
END AS days_diff
FROM table
However this doesn't help me compare cases where a user has activated a product and has an S
event not directly followed by an F
event. E.g. a case as below
UID | |Product | Event | Time | Days_Diff
A C1 F 2017-06-23 4
A C2 S 2017-06-21 NULL
A C1 S 2017-06-19 NULL
A C1 S 2017-06-17 NULL
How should I approach this problem?
Upvotes: 2
Views: 229
Reputation: 1271151
You seem to want the time from the earliest "S" to "F". If so:
SELECT UID, Product, Event, Time,
(CASE WHEN Event = 'F'
THEN DATEDIFF(DAY,
MIN(CASE WHEN Event = 'S' THEN Time END)
OVER (PARTITION BY UID
ORDER BY TIME
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
),
Time
)
END) AS days_diff
FROM table;
Note: This uses Redshift's DATEDIFF()
function. The first argument (as far as I know) is a datepart, not a string.
EDIT:
I see, you want the "S" on the same product. That is just a little different:
SELECT UID, Product, Event, Time,
(CASE WHEN Event = 'F'
THEN DATEDIFF(DAY,
MAX(CASE WHEN Event = 'S' THEN Time END)
OVER (PARTITION BY UID
ORDER BY TIME
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
),
Time
)
END) AS days_diff
FROM table;
Upvotes: 2