user8834780
user8834780

Reputation: 1670

Check datediff criteria between every value per user

Sample data:

id   rank   onboarded_at   applied_at   
A    1      1/1/2018       1/2/2018     
A    2      1/1/2018       2/3/2018     
A    3      1/1/2018       3/4/2018     
B    1      2/1/2018       null        
B    2      2/1/2018       8/1/2018     
C    1      3/1/2018       4/1/2018     
C    2      3/1/2018       7/1/2018     
C    3      3/1/2018       8/1/2018    

I can calculate lapsed_now if there are >12 weeks (ie. 84 days) between: 1) onboarded_at and current_date (if no applied_at exists), 2) max(applied_at) and current_date.

It is calculated like so:

case when max(applied_at) over (partition by id) is null and current_date - 84 > onboarded_at::date then 1
when current_date - 84 > max(applied_at) over (partition by id)::date and onboarded_at < max(applied_at) over (partition by id) then 1 else 0 end lapsed_now

Output:

id   rank   onboarded_at   applied_at   lapsed_now
A    1      1/1/2018       1/2/2018     1
A    2      1/1/2018       2/3/2018     1
A    3      1/1/2018       3/4/2018     1 
B    1      2/1/2018       null         0
B    2      2/1/2018       8/1/2018     0 --not lapsed now
C    1      3/1/2018       4/1/2018     0
C    2      3/1/2018       7/1/2018     0
C    3      3/1/2018       8/1/2018     0 --not lapsed now

However, I would like to similarly calculate an id level lapsed_ever column, where we add these 2 additional conditions to lapsed_now:

1) If it took >84 days to get first applied_at (no matter if there is an applied_atwithin 84 days later)

2) If it took >84 between ANY of applied_at for the id (each applied has rank +1)

How can I do it? This would be the output- where B is lapsed_ever because of how long it took to get first applied_at and C is lapsed_ever because of how long it took to get from rank=1 to rank=2:

id   rank   onboarded_at   applied_at   lapsed_now   lapsed_ever
A    1      1/1/2018       1/2/2018     1            0           
A    2      1/1/2018       2/3/2018     1            0
A    3      1/1/2018       3/4/2018     1            0
B    1      2/1/2018       null         0            1 --lapsed
B    2      2/1/2018       8/1/2018     0            1
C    1      3/1/2018       4/1/2018     0            1
C    2      3/1/2018       8/1/2018     0            1 --lapsed
C    3      3/1/2018       9/1/2018     0            1

For condition #1 I think I can do: case when datediff(day, onboarded_at, min(applied_at) over (partition by id))>84 then 1 But not sure what to do for #2

----EDIT---- As per @Jim's answer, it looks like this is my output:

id  rank    onboarded_at    applied_at  lapsed_now  lapsed_date
A   1       1/1/2018        1/2/2018    1           (null)
A   2       1/1/2018        2/3/2018    1           (null)
A   3       1/1/2018        3/4/2018    1           (null)
B   1       2/1/2018        (null)      0           10/24/2018 
B   2       2/1/2018        8/1/2018    0           10/24/2018 
C   1       3/1/2018        4/1/2018    0           11/24/2018 
C   2       3/1/2018        8/1/2018    0           11/24/2018 
C   3       3/1/2018        9/1/2018    0           11/24/2018 

However, it should be:

id  rank    onboarded_at    applied_at  lapsed_now  lapsed_date
A   1       1/1/2018        1/2/2018    1           5/27/2018
A   2       1/1/2018        2/3/2018    1           5/27/2018
A   3       1/1/2018        3/4/2018    1           5/27/2018 (March 4 + 84)
B   1       2/1/2018        (null)      0           4/26/2018 (Feb 1 + 84)
B   2       2/1/2018        8/1/2018    0           4/26/2018
C   1       3/1/2018        4/1/2018    0           6/24/2018 
C   2       3/1/2018        8/1/2018    0           6/24/2018 (Apr 1 + 84)
C   3       3/1/2018        9/1/2018    0           6/24/2018 

Upvotes: 0

Views: 168

Answers (1)

Jim Jimson
Jim Jimson

Reputation: 2528

It seems like the two cases you mentioned are effectively the same. You want to know if 84 days ever elapsed for each id. You can do that with the below:

CREATE TABLE dbo.Onboarded
(
  id VARCHAR(10),
  rank INTEGER,
  onboarded_at DATE,
  applied_at DATE
  );

INSERT INTO dbo.Onboarded VALUES
('A',1,'20180101','20180102'),
('A',2,'20180101','20180203'),
('A',3,'20180101','20180304'),
('B',1,'20180201',NULL),
('B',2,'20180201','20180801'),
('C',1,'20180301','20180401'),
('C',2,'20180301','20180801'),
('C',3,'20180301','20180901');

SELECT onb.[id],
onb.[rank],
onb.[onboarded_at],
onb.[applied_at],
onb.[lapsed_now],
CASE WHEN [lapsed_now] = 1 OR [lapsed_previous] = 1
    THEN 1
    ELSE 0
END [lapsed_ever],
CASE WHEN lapsed_now = 1
    THEN DATEADD(DAY, 84, [lapsed_now_date])
    ELSE [min_applied_at_add_84]
END [lapsed_date]
FROM
(SELECT *,
CASE
    WHEN DATEDIFF(DAY, onboarded_at, MIN(ISNULL(applied_at, onboarded_at)) over (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)) >= 84
        THEN 1
    WHEN DATEDIFF(DAY, MAX(applied_at) OVER (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), GETDATE()) >= 84
        THEN 1
    ELSE 0
END [lapsed_now],
CASE
    WHEN MAX(DATEDIFF(DAY, onboarded_at, ISNULL(applied_at, GETDATE()))) OVER (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) >= 84
        THEN 1
    ELSE 0
END [lapsed_previous],
MAX(applied_at) OVER (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) [lapsed_now_date],
DATEADD(DAY, 84, MIN(CASE WHEN applied_at IS NULL THEN onboarded_at ELSE applied_at END) OVER (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)) [min_applied_at_add_84]
FROM dbo.Onboarded
) onb

Results:

id  rank    onboarded_at    applied_at  lapsed_now  lapsed_ever lapsed_date
A   1       2018-01-01      2018-01-02  1           1           2018-05-27
A   2       2018-01-01      2018-02-03  1           1           2018-05-27
A   3       2018-01-01      2018-03-04  1           1           2018-05-27
B   1       2018-02-01      (null)      0           1           2018-04-26
B   2       2018-02-01      2018-08-01  0           1           2018-04-26
C   1       2018-03-01      2018-04-01  0           1           2018-06-24
C   2       2018-03-01      2018-08-01  0           1           2018-06-24
C   3       2018-03-01      2018-09-01  0           1           2018-06-24

The key is ensuring that your window function pays attention to all rows in each window, which is where the ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING comes in.

You can read more about how to use them here.

Upvotes: 2

Related Questions