Taylor
Taylor

Reputation: 43

Dateadd for moving 7 day avg VS (partition by price order by date rows between 6 preceding and current row)

Will the statement

Select productid, avg(price) as avgPrice, date as salesDate
from sales
where date between getdate() and dateadd(day, -7, getdate)
group by date

return the same results as

select productid, 
 avg(price) over (partition by productid order by date rows between 6 preceding and current row) 
 as avgPrice,
 date as salesDate
from sales
group by productid, date

Upvotes: -1

Views: 51

Answers (2)

T N
T N

Reputation: 10024

The expression avg(price) over(...) is not valid in your original post, because a window function must reference a value that would otherwise be valid in the select list. Since price is not part of the group by, it is not valid in that context.

The expression avg(avg(price)) over(...) would be valid (since avg(price) is an allowed select list item), but that would not give the correct result because the average of averages of subsets is not the same as the average of the whole set.

Something like sum(sum(price)) over(...) / sum(count(*)) over(...) might work, but if there are gaps in sale dates, your rows range might reach back more than than the intended 7 days. You would need incorporate a calendar table or date generator into you solution to fill in any missing dates for each product.

The following will generate a range of dates, cross join them with a distinct list of product IDs, and then left join that combination with the sales data. The combined data is then grouped and aggregated to produce the desired results.

-- Correct results are obtained by using a calendar table to fill in any date gaps.
with calendar as (
    select min(date) as date, max(date) as endDate
    from sales
    union all
    select dateadd(day, 1, date), endDate
    from calendar
    where date < enddate
),
products as (
    select distinct productid
    from sales
)
select
    p.productid,
    c.date as salesDate, 
    count(*) as numSales,
    avg(price) as avg1DayPrice,
    sum(sum(s.price)) over (partition by p.productid
                            order by c.date
                            rows between 6 preceding and current row)
        / sum(count(s.price)) over(partition by p.productid
                                   order by c.date
                                   rows between 6 preceding and current row)
        as avg7DayPrice
from calendar c
cross join products p
left join sales s
    on s.date = c.date
    and s.productid = p.productid
group by p.productid, c.date
order by p.productid, c.date

If you are using SQL Server 2022 (or later), you can move the repeated OVER specifications to a named WINDOW specification.

select ...
    sum(sum(s.price)) over last7days
        / sum(count(s.price)) over last7days
        as avg7DayPrice
...
window last7days as (partition by p.productid
                     order by c.date
                     rows between 6 preceding and current row)
...

An altogether different approach is to use a subquery instead of the window functions to calculate the 7-day average.

select
    s.productid,
    s.date as salesDate,
    count(*) as numSales,
    avg(s.price) as avg1DayPrice,
    (
        select avg(s2.price)
        from sales s2
        where s2.productid = s.productid
        and s2.date between dateadd(day, -6, s.date) and s.date
    ) as avg7DayPrice
from sales s
group by s.productid, s.date
order by s.productid, s.date

See this db<>fiddle for a demo of the various techniques discussed, with some simple test data.

Upvotes: 1

Bart McEndree
Bart McEndree

Reputation: 3230

There are many data sets where the 2 queries (if properly written) will return different results.

First off, in the first query the WHERE clause will restrict dates for the last week. The Second query does not have a WHERE clause and will return all rows in the sales table.

Another issues is, the first query has grouping on date so you will only get 1 row per date. The second query groups by productid and date so you will get 1 row per productid per day.

Upvotes: 1

Related Questions