NatalieGoodman

Reputation: 21

Rolling Sum Calculation Based on 2 Date Fields

Giving up after a few hours of failed attempts.

My data is in the following format; create_date can never be later than event_date. I need to calculate, on a rolling n-day basis (say n = 3), the sum of units where the create_date and event_date both fall within the same 3-day window. The data is illustrative: each event_date can have more than 500 different create_dates associated with it, and that number isn't constant. Some event_dates may be missing.

So let's say for 2022-02-03, I only want to sum units where both the event_date and create_date values were between 2022-02-01 and 2022-02-03.

event_date create_date rowid units
2022-02-01 2022-01-20 1 100
2022-02-01 2022-02-01 2 100
2022-02-02 2022-01-21 3 100
2022-02-02 2022-01-23 4 100
2022-02-02 2022-01-31 5 100
2022-02-02 2022-02-02 6 100
2022-02-03 2022-01-30 7 100
2022-02-03 2022-02-01 8 100
2022-02-03 2022-02-03 9 100
2022-02-05 2022-02-01 10 100
2022-02-05 2022-02-03 11 100

The output I need to get to is below (in brackets I've added the rows that should be included in the calculation for each date, but my result only needs to include the numerical sum). I tried calculating using either date on its own, but neither returned the results I needed.

date units
2022-02-01 100 (Row 2)
2022-02-02 300 (Row 2,5,6)
2022-02-03 400 (Row 2,6,8,9)
2022-02-04 200 (Row 6,9)
2022-02-05 200 (Row 9,11)

In Python I solved the above with a function that loops through the dates and filters a dataframe for each one, but I am struggling to do the same in SQL.
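
For reference, the per-date filtering loop could look roughly like the sketch below. It is not my actual code: it uses plain tuples instead of a dataframe, and the names ROWS and rolling_units are made up for illustration. It only shows the rule that both dates must fall inside the same 3-day window.

```python
from datetime import date, timedelta

# Sample rows from the question: (event_date, create_date, rowid, units)
ROWS = [
    (date(2022, 2, 1), date(2022, 1, 20), 1, 100),
    (date(2022, 2, 1), date(2022, 2, 1), 2, 100),
    (date(2022, 2, 2), date(2022, 1, 21), 3, 100),
    (date(2022, 2, 2), date(2022, 1, 23), 4, 100),
    (date(2022, 2, 2), date(2022, 1, 31), 5, 100),
    (date(2022, 2, 2), date(2022, 2, 2), 6, 100),
    (date(2022, 2, 3), date(2022, 1, 30), 7, 100),
    (date(2022, 2, 3), date(2022, 2, 1), 8, 100),
    (date(2022, 2, 3), date(2022, 2, 3), 9, 100),
    (date(2022, 2, 5), date(2022, 2, 1), 10, 100),
    (date(2022, 2, 5), date(2022, 2, 3), 11, 100),
]


def rolling_units(rows, window_days=3):
    """Sum units per calendar day over an inclusive trailing window.

    A row counts for a given day only if its event_date AND its
    create_date both lie in [day - (window_days - 1), day].
    """
    first = min(r[0] for r in rows)
    last = max(r[0] for r in rows)
    totals = {}
    day = first
    # Walk every calendar day so that missing event_dates still get a row.
    while day <= last:
        start = day - timedelta(days=window_days - 1)
        totals[day] = sum(
            units
            for event_date, create_date, _rowid, units in rows
            if start <= event_date <= day and start <= create_date <= day
        )
        day += timedelta(days=1)
    return totals


for day, units in rolling_units(ROWS).items():
    print(day, units)
```

On the sample rows this gives 100, 300, 400, 200, 200 for 2022-02-01 through 2022-02-05 (rows 2, 6, 8 and 9 make up the 400 on 2022-02-03).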

Thank you!

Upvotes: 0

Views: 85

Answers (1)

Mikhail Berlyant

Reputation: 172944

Consider the approach below:

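-- Build one row per calendar day between the first and last event_date,
-- so days with no events (e.g. 2022-02-04) still appear in the output.
with events_dates as (
  select date from (
    select min(event_date) min_date, max(event_date) max_date
    from your_table
  ), unnest(generate_date_array(min_date, max_date)) date
)
select date, sum(units) as units, string_agg('' || rowid) rows_included
from events_dates
-- Keep only rows where both dates fall in the 3-day window [date - 2, date].
left join your_table
on create_date between date - 2 and date
and event_date between date - 2 and date
group by date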

If applied to the sample data in your question, the output is:

[image: query output]

Upvotes: 1
