Macter
Macter

Reputation: 132

Rolling sum for historical dates

I have a table orders that looks like this:

order_id sales_amount order_time store_id
1412412 30 2022/03/28 456
1551211 5 2022/03/27 145

I am interested in calculating the sales from stores that had their first order in the last 28 days, on a rolling basis. The following will give me this for the most recent day:

with first_order_dates AS (
select 
min(order_time) as first_order_time, 
store_id
from Orders
group by store_id
)

select 
dateadd(day,-1, cast(getdate() as date)) AS date, 
sum(sales_amount) AS new_revenue_last_28d
from Orders
left join first_order_dates
on first_order_dates.store_id = Orders.store_id
where first_order_time between dateadd(day,-29, cast(getdate() as date)) and dateadd(day,-1, cast(getdate() as date))
group by dateadd(day,-1, cast(getdate() as date))

Resulting in:

Date new_revenue_last_28d
2022/04/06 5400

What I want is to go back and calculate this for every historical day, i.e to end up with

Date new_revenue_last_28d
2022/04/06 5400
2022/04/05 5732
2022/04/04 4300

and so on so I can chart this. I have run out of ideas - how can I do this with only the info I have available? Using Snowflake ideally

Upvotes: 0

Views: 455

Answers (1)

Simeon Pilgrim
Simeon Pilgrim

Reputation: 25968

So if you want to only show sales for shops that have their first sale in the last 28 days, and for "those 28 days, have a rolling window of the sum of those sales"

WITH data as (
    select * from values
    (100, '2022-04-07'::date, 10),
    (100, '2022-04-06'::date, 8),
    (100, '2022-04-05'::date, 11),
    (100, '2022-04-01'::date, 12),
    (101, '2022-04-02'::date, 110),
    (101, '2022-04-01'::date, 120)
    t(store_id, order_date, sales_amount)
), store_valid_orders as (
    select 
        store_id
        ,order_date
        ,sales_amount
    from data
    qualify min(order_date) over(partition by store_id) >= current_date() - 28
), those_28_days as (
    select current_date() - row_number()over(order by null) + 1 as date
    from table(generator(ROWCOUNT => 29))
), day_join_sales as (
    select 
        d.date
        ,s.store_id
        ,sum(s.sales_amount) as sales_amount
    from those_28_days as d
    left join store_valid_orders as s on d.date = s.order_date
    group by 1,2
)
select 
    date
    ,store_id
    ,sum(sales_amount) over(partition by store_id order by date rows between 28 preceding and current row ) as prior_28_days_sales
from day_join_sales
qualify store_id is not null;

gives:

DATE STORE_ID PRIOR_28_DAYS_SALES
2022-04-01 100 12
2022-04-05 100 23
2022-04-06 100 31
2022-04-07 100 41
2022-04-01 101 120
2022-04-02 101 230

that is actually more complex that it needs to be.. but I half have the concept for solving rolling windows of days, which include the first sales with respect to rolling date. Which is more complex, but the above might be enough to answer your question. So I will stop here.

Take 2:

with daily 28 days of sales per store, rolled into single daily total:

WITH data as (
    select * from values
    (100, '2022-04-07'::date, 10),
    (100, '2022-04-06'::date, 8),
    (100, '2022-04-05'::date, 11),
    (100, '2022-04-01'::date, 12),
    (101, '2022-04-02'::date, 110),
    (101, '2022-04-01'::date, 120)
    t(store_id, order_date, sales_amount)
), store_first_orders as (
    select 
        store_id
        ,min(order_date) as first_order
    from data
    group by 1
), _29_rows as (
    select
        row_number()over(order by null) - 1 as rn
    from table(generator(ROWCOUNT => 29))
), those_29_rows as (
    select
        v.store_id
        ,dateadd(day, r.rn, v.first_order) as date
    from _29_rows as r
    full join store_first_orders as v
), first_28_days_of_data as (
    select 
        r.store_id
        ,r.date
        ,d.sales_amount
    from those_29_rows r
    left join data as d
        on d.store_id = r.store_id AND d.order_date = r.date
), per_site_dailies as (
    select 
        store_id
        ,date
        ,sum(sales_amount) over(partition by store_id order by date) as roll_sales
    from first_28_days_of_data
    order by 2,1
)
select 
    date,
    sum(roll_sales) as new_revenue_last_28d
from per_site_dailies
group by 1
having date <= current_date()
order by 1;

gives:

DATE NEW_REVENUE_LAST_28D
2022-04-01 132
2022-04-02 242
2022-04-03 242
2022-04-04 242
2022-04-05 253
2022-04-06 261
2022-04-07 271
2022-04-08 271

Upvotes: 1

Related Questions