Reputation: 155
I'm trying to get a unique list of rows based on a column and a date range from another column. Here is example data:
id CustomerNum ServiceDate
------------------------------------------------
4406290 000000000066 2017-02-17 13:03:00.000
4406294 000000000066 2017-02-17 13:07:00.000
4406295 000000000066 2017-02-17 13:09:00.000
4406295 000000000066 2017-02-09 13:09:00.000
4406352 000000000066 2017-01-17 13:12:00.000
4406369 000000000066 2017-03-17 13:16:00.000
4582381 000000ABC102 2016-03-22 14:48:00.017
4589037 000000ABC102 2016-07-23 14:54:11.223
4625101 000009983148 2017-03-30 15:21:11.283
4625162 000005555398 2017-01-30 11:22:20.907
4625165 000005555398 2017-03-30 12:22:20.907
4625168 000005555398 2017-03-30 15:22:20.907
I need to group by CustomerNum and then by ServiceDate but only if the ServiceDate is within 7 days. I'd like to have the max (latest) row by grouping. So the results should be:
id CustomerNum ServiceDate
------------------------------------------------
4406295 000000000066 2017-02-17 13:09:00.000
4406295 000000000066 2017-02-09 13:09:00.000
4406352 000000000066 2017-01-17 13:12:00.000
4406369 000000000066 2017-03-17 13:16:00.000
4582381 000000ABC102 2016-03-22 14:48:00.017
4589037 000000ABC102 2016-07-23 14:54:11.223
4625101 000009983148 2017-03-30 15:21:11.283
4625162 000005555398 2017-01-30 11:22:20.907
4625168 000005555398 2017-03-30 15:22:20.907
I've tried the following:
WITH cte
AS (
SELECT *
,ROW_NUMBER() OVER (
PARTITION BY CustomerNum ORDER BY ServiceDate DESC
) AS rn
FROM CustomerTransactions
)
SELECT *
FROM cte
WHERE rn = 1
but this gives me row numbers over all CustomerNum instead of starting row number over at 1 after date range has been met.
I know I'm missing something. Any thoughts? Thank you.
Upvotes: 0
Views: 54
Reputation: 1271151
For this problem, you want to use lag()
and cumulative sum. Use lag to determine where each group starts, then cumulative sum to assign the groups:
select sum(case when prev_ServiceDate > dateadd(day, -7, ServiceDate) then 0 else 1 end) over
(partition by CustomerNum order by ServiceDate) as grp
from (select ct.*,
lag(ServiceDate) over (partition by CustomerNum order by ServiceDate) as prev_ServiceDate
from CustomerTransactions ct
) ct;
You can then summarize the groups using aggregation:
select CustomerNum, min(ServiceDate), max(ServiceDate)
from (select sum(case when prev_ServiceDate > dateadd(day, -7, ServiceDate) then 0 else 1 end) over
(partition by CustomerNum order by ServiceDate) as grp
from (select ct.*,
lag(ServiceDate) over (partition by CustomerNum order by ServiceDate) as prev_ServiceDate
from CustomerTransactions ct
) ct
) ct
group by CustomerNum, grp
Upvotes: 2