Black
Black

Reputation: 4654

SQL: Date difference between last non-equal event

I have a table of the form below that has: user id, product activated, event that occurs and time that corresponds to the event that occurs.

UID | |Product | Event | Time
A      C1         F      2017-06-23
A      C2         S      2017-06-21
A      C1         S      2017-06-19
A      C1         S      2017-06-17
B      C3         F      2017-06-12
B      C3         S      2017-06-12
C      C2         F      2017-06-02
C      C2         F      2017-06-01

I would like to find the time difference from the previousS event to F current event for each user and product.

UID | |Product | Event | Time        | Days_Diff
A      C1         F      2017-06-23    4
A      C2         S      2017-06-21    NULL
A      C1         S      2017-06-19    NULL
A      C1         S      2017-06-17    NULL
B      C3         F      2017-06-12    0
B      C3         S      2017-06-12    NULL
C      C2         F      2017-06-02    NULL
C      C2         F      2017-06-01    NULL

I tried something like the below but it doesn't help me keep track of the last product and event

SELECT UID, Product, Event, Time,
       CASE 
       -- product is equal to last product
       WHEN Product = LAG(Product, 1) OVER (PARTITION BY UID, Product ORDER BY Time) 
       -- current event = F and last event = S
       AND Event = 'F' AND LAG(Event, 1) OVER (PARTITION BY UID, Product ORDER BY Time) = 'S' 
       -- subtract current time by the last time this product was activated
       THEN DATEDIFF('DAY', MAX(Time) OVER (PARTITION BY UID, Product ORDER BY Time 
                                   ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), Time) 
       END AS days_diff
FROM table

However this doesn't help me compare cases where a user has activated a product and has an S event not directly followed by an F event. E.g. a case as below

UID | |Product | Event | Time        | Days_Diff
A      C1         F      2017-06-23    4
A      C2         S      2017-06-21    NULL
A      C1         S      2017-06-19    NULL
A      C1         S      2017-06-17    NULL

How should I approach this problem?

Upvotes: 2

Views: 229

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271151

You seem to want the time from the earliest "S" to "F". If so:

SELECT UID, Product, Event, Time,
       (CASE WHEN Event = 'F'
             THEN DATEDIFF(DAY,
                           MIN(CASE WHEN Event = 'S' THEN Time END) 
                               OVER (PARTITION BY UID
                                     ORDER BY TIME
                                     ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
                                    ),
                           Time
                          )
        END) AS days_diff
FROM table;

Note: This uses Redshift's DATEDIFF() function. The first argument (as far as I know) is a datepart, not a string.

EDIT:

I see, you want the "S" on the same product. That is just a little different:

SELECT UID, Product, Event, Time,
       (CASE WHEN Event = 'F'
             THEN DATEDIFF(DAY,
                           MAX(CASE WHEN Event = 'S' THEN Time END)                                    
                               OVER (PARTITION BY UID
                                     ORDER BY TIME
                                     ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
                                    ),
                           Time
                          )
        END) AS days_diff
FROM table;

Upvotes: 2

Related Questions