Reputation: 1670
Sample data:
id rank onboarded_at applied_at
A 1 1/1/2018 1/2/2018
A 2 1/1/2018 2/3/2018
A 3 1/1/2018 3/4/2018
B 1 2/1/2018 null
B 2 2/1/2018 8/1/2018
C 1 3/1/2018 4/1/2018
C 2 3/1/2018 7/1/2018
C 3 3/1/2018 8/1/2018
I can calculate lapsed_now
if there are >12 weeks (ie. 84 days) between: 1) onboarded_at
and current_date
(if no applied_at
exists), 2) max(applied_at)
and current_date
.
It is calculated like so:
case when max(applied_at) over (partition by id) is null and current_date - 84 > onboarded_at::date then 1
when current_date - 84 > max(applied_at) over (partition by id)::date and onboarded_at < max(applied_at) over (partition by id) then 1 else 0 end lapsed_now
Output:
id rank onboarded_at applied_at lapsed_now
A 1 1/1/2018 1/2/2018 1
A 2 1/1/2018 2/3/2018 1
A 3 1/1/2018 3/4/2018 1
B 1 2/1/2018 null 0
B 2 2/1/2018 8/1/2018 0 --not lapsed now
C 1 3/1/2018 4/1/2018 0
C 2 3/1/2018 7/1/2018 0
C 3 3/1/2018 8/1/2018 0 --not lapsed now
However, I would like to similarly calculate an id
level lapsed_ever
column, where we add these 2 additional conditions to lapsed_now:
1) If it took >84 days to get first applied_at
(no matter if there is an applied_at
within 84 days later)
2) If it took >84 between ANY of applied_at
for the id (each applied has rank +1)
How can I do it? This would be the output- where B is lapsed_ever
because of how long it took to get first applied_at
and C is lapsed_ever
because of how long it took to get from rank=1
to rank=2
:
id rank onboarded_at applied_at lapsed_now lapsed_ever
A 1 1/1/2018 1/2/2018 1 0
A 2 1/1/2018 2/3/2018 1 0
A 3 1/1/2018 3/4/2018 1 0
B 1 2/1/2018 null 0 1 --lapsed
B 2 2/1/2018 8/1/2018 0 1
C 1 3/1/2018 4/1/2018 0 1
C 2 3/1/2018 8/1/2018 0 1 --lapsed
C 3 3/1/2018 9/1/2018 0 1
For condition #1 I think I can do:
case when datediff(day, onboarded_at, min(applied_at) over (partition by id))>84 then 1
But not sure what to do for #2
----EDIT---- As per @Jim's answer, it looks like this is my output:
id rank onboarded_at applied_at lapsed_now lapsed_date
A 1 1/1/2018 1/2/2018 1 (null)
A 2 1/1/2018 2/3/2018 1 (null)
A 3 1/1/2018 3/4/2018 1 (null)
B 1 2/1/2018 (null) 0 10/24/2018
B 2 2/1/2018 8/1/2018 0 10/24/2018
C 1 3/1/2018 4/1/2018 0 11/24/2018
C 2 3/1/2018 8/1/2018 0 11/24/2018
C 3 3/1/2018 9/1/2018 0 11/24/2018
However, it should be:
id rank onboarded_at applied_at lapsed_now lapsed_date
A 1 1/1/2018 1/2/2018 1 5/27/2018
A 2 1/1/2018 2/3/2018 1 5/27/2018
A 3 1/1/2018 3/4/2018 1 5/27/2018 (March 4 + 84)
B 1 2/1/2018 (null) 0 4/26/2018 (Feb 1 + 84)
B 2 2/1/2018 8/1/2018 0 4/26/2018
C 1 3/1/2018 4/1/2018 0 6/24/2018
C 2 3/1/2018 8/1/2018 0 6/24/2018 (Apr 1 + 84)
C 3 3/1/2018 9/1/2018 0 6/24/2018
Upvotes: 0
Views: 168
Reputation: 2528
It seems like the two cases you mentioned are effectively the same. You want to know if 84 days ever elapsed for each id. You can do that with the below:
CREATE TABLE dbo.Onboarded
(
id VARCHAR(10),
rank INTEGER,
onboarded_at DATE,
applied_at DATE
);
INSERT INTO dbo.Onboarded VALUES
('A',1,'20180101','20180102'),
('A',2,'20180101','20180203'),
('A',3,'20180101','20180304'),
('B',1,'20180201',NULL),
('B',2,'20180201','20180801'),
('C',1,'20180301','20180401'),
('C',2,'20180301','20180801'),
('C',3,'20180301','20180901');
SELECT onb.[id],
onb.[rank],
onb.[onboarded_at],
onb.[applied_at],
onb.[lapsed_now],
CASE WHEN [lapsed_now] = 1 OR [lapsed_previous] = 1
THEN 1
ELSE 0
END [lapsed_ever],
CASE WHEN lapsed_now = 1
THEN DATEADD(DAY, 84, [lapsed_now_date])
ELSE [min_applied_at_add_84]
END [lapsed_date]
FROM
(SELECT *,
CASE
WHEN DATEDIFF(DAY, onboarded_at, MIN(ISNULL(applied_at, onboarded_at)) over (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)) >= 84
THEN 1
WHEN DATEDIFF(DAY, MAX(applied_at) OVER (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), GETDATE()) >= 84
THEN 1
ELSE 0
END [lapsed_now],
CASE
WHEN MAX(DATEDIFF(DAY, onboarded_at, ISNULL(applied_at, GETDATE()))) OVER (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) >= 84
THEN 1
ELSE 0
END [lapsed_previous],
MAX(applied_at) OVER (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) [lapsed_now_date],
DATEADD(DAY, 84, MIN(CASE WHEN applied_at IS NULL THEN onboarded_at ELSE applied_at END) OVER (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)) [min_applied_at_add_84]
FROM dbo.Onboarded
) onb
Results:
id rank onboarded_at applied_at lapsed_now lapsed_ever lapsed_date
A 1 2018-01-01 2018-01-02 1 1 2018-05-27
A 2 2018-01-01 2018-02-03 1 1 2018-05-27
A 3 2018-01-01 2018-03-04 1 1 2018-05-27
B 1 2018-02-01 (null) 0 1 2018-04-26
B 2 2018-02-01 2018-08-01 0 1 2018-04-26
C 1 2018-03-01 2018-04-01 0 1 2018-06-24
C 2 2018-03-01 2018-08-01 0 1 2018-06-24
C 3 2018-03-01 2018-09-01 0 1 2018-06-24
The key is ensuring that your window function pays attention to all rows in each window, which is where the ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING comes in.
You can read more about how to use them here.
Upvotes: 2