HeonAle
HeonAle

Reputation: 155

Get Max Row by Column and Date Range

I'm trying to get a unique list of rows based on a column and a date range from another column. Here is example data:

id      CustomerNum    ServiceDate
------------------------------------------------
4406290 000000000066    2017-02-17 13:03:00.000
4406294 000000000066    2017-02-17 13:07:00.000
4406295 000000000066    2017-02-17 13:09:00.000
4406295 000000000066    2017-02-09 13:09:00.000
4406352 000000000066    2017-01-17 13:12:00.000
4406369 000000000066    2017-03-17 13:16:00.000
4582381 000000ABC102    2016-03-22 14:48:00.017
4589037 000000ABC102    2016-07-23 14:54:11.223
4625101 000009983148    2017-03-30 15:21:11.283
4625162 000005555398    2017-01-30 11:22:20.907
4625165 000005555398    2017-03-30 12:22:20.907
4625168 000005555398    2017-03-30 15:22:20.907

I need to group by CustomerNum and then by ServiceDate but only if the ServiceDate is within 7 days. I'd like to have the max (latest) row by grouping. So the results should be:

id      CustomerNum    ServiceDate
------------------------------------------------
4406295 000000000066    2017-02-17 13:09:00.000
4406295 000000000066    2017-02-09 13:09:00.000
4406352 000000000066    2017-01-17 13:12:00.000
4406369 000000000066    2017-03-17 13:16:00.000
4582381 000000ABC102    2016-03-22 14:48:00.017
4589037 000000ABC102    2016-07-23 14:54:11.223
4625101 000009983148    2017-03-30 15:21:11.283
4625162 000005555398    2017-01-30 11:22:20.907
4625168 000005555398    2017-03-30 15:22:20.907

I've tried the following:

WITH cte
AS (
SELECT *
    ,ROW_NUMBER() OVER (
        PARTITION BY CustomerNum ORDER BY ServiceDate DESC
        ) AS rn
FROM CustomerTransactions
)
SELECT *
FROM cte
WHERE rn = 1

but this gives me row numbers over all CustomerNum instead of starting row number over at 1 after date range has been met.

I know I'm missing something. Any thoughts? Thank you.

Upvotes: 0

Views: 54

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271151

For this problem, you want to use lag() and cumulative sum. Use lag to determine where each group starts, then cumulative sum to assign the groups:

select sum(case when prev_ServiceDate > dateadd(day, -7, ServiceDate) then 0 else 1 end) over
           (partition by CustomerNum order by ServiceDate) as grp
from (select ct.*, 
             lag(ServiceDate) over (partition by CustomerNum order by ServiceDate) as prev_ServiceDate
      from CustomerTransactions ct
     ) ct;

You can then summarize the groups using aggregation:

select CustomerNum, min(ServiceDate), max(ServiceDate)
from (select sum(case when prev_ServiceDate > dateadd(day, -7, ServiceDate) then 0 else 1 end) over
                 (partition by CustomerNum order by ServiceDate) as grp
      from (select ct.*, 
                   lag(ServiceDate) over (partition by CustomerNum order by ServiceDate) as prev_ServiceDate
            from CustomerTransactions ct
           ) ct
     ) ct
group by CustomerNum, grp

Upvotes: 2

Related Questions