yololo
yololo

Reputation: 319

SQL - get counts based on rolling window per unique id

I'm working with a table that has an id and date column. For each id, there's a 90-day window where multiple transactions can be made. The 90-day window starts when the first transaction is made and the clock resets once the 90 days are over. When the new 90-day window begins triggered by a new transaction I want to start the count from the beginning at one. I would like to generate something like this with the two additional columns (window and count) in SQL:

id      date        window  count   
name1   7/7/2019    first   1
name1   12/31/2019  second  1
name1   1/23/2020   second  2
name1   1/23/2020   second  3
name1   2/12/2020   second  4 
name1   4/1/2020    third   1
name2   6/30/2019   first   1
name2   8/14/2019   first   2 

I think getting the rank of the window can be done with a CASE statement and MIN(date) OVER (PARTITION BY id). This is what I have in mind for that:

CASE WHEN MIN(date) OVER (PARTITION BY id) THEN 'first' 
WHEN DATEDIFF(day, date, MIN(date) OVER (PARTITION BY id)) <= 90 THEN 'first'
WHEN DATEDIFF(day, date, MIN(date) OVER (PARTITION BY id)) > 90 AND DATEDIFF(day, date, MIN(date) OVER (PARTITION BY id)) <= 180 THEN 'third'
WHEN DATEDIFF(day, date, MIN(date) OVER (PARTITION BY id)) > 180 AND DATEDIFF(day, date, MIN(date) OVER (PARTITION BY id)) <= 270 THEN 'fourth' 
ELSE NULL END

And incrementing the counts within the windows would be ROW_NUMBER() OVER (PARTITION BY id, window)?

Upvotes: 1

Views: 1231

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269743

GMB is totally correct that a recursive CTE is needed. I offer this as an alternative form for two reasons. First, because it uses SQL Server syntax, which appears to be the database being used in the question. Second, because it directly calculates window and count without window functions:

with t as (
      select t.*, row_number() over (partition by id order by date) as seqnum
      from tbl t
     ),
     cte as (
      select t.id, t.date, dateadd(day, 90, t.date) as window_end, 1 as window, 1 as count, seqnum
      from t
      where seqnum = 1
      union all
      select t.id, t.date,
             (case when t.date > cte.window_end then dateadd(day, 90, t.date)
                   else cte.window_end
              end) as window_end,
             (case when t.date > cte.window_end then window + 1 else window end) as window,
             (case when t.date > cte.window_end then 1 else cte.count + 1 end) as count,
             t.seqnum
      from cte join
           t
           on t.id = cte.id and
              t.seqnum = cte.seqnum + 1
     )
select id, date, window, count
from cte
order by 1, 2;

Here is a db<>fiddle.

Upvotes: 1

GMB
GMB

Reputation: 222462

You cannot solve this problem with window functions only. You need to iterate through the dataset, which can be done with a recursive query:

with 
    tab as (
        select t.*, row_number() over(partition by id order by date) rn
        from mytable t
    )
    cte as (
        select id, date, rn, date date0 from tab where rn = 1
        union all
        select t.id, t.date, t.rn, greatest(t.date, c.date + interval '90' day)
        from cte c
        inner join tab t on t.id = c.id and t.rn = c.rn + 1
    )
select
    id,
    date,
    dense_rank() over(partition by id order by date0) grp,
    count(*)     over(partition by id order by date0, date) cnt
from cte

The first query in the with clause ranks records having the same id by increasing date; then, the recursive query traverses the data set and computes the starting date of each group. The last step is numbering the groups and computing the window count.

Upvotes: 1

Related Questions