Reputation: 441
I have the following table in my Postgresql 9.1 database:
select * from ro;
date | shop_id | amount
-----------+----------+--------
2013-02-07 | 1001 | 3
2013-01-31 | 1001 | 2
2013-01-24 | 1001 | 1
2013-01-17 | 1001 | 5
2013-02-10 | 1001 | 10
2013-02-03 | 1001 | 4
2012-12-27 | 1001 | 6
2012-12-20 | 1001 | 8
2012-12-13 | 1001 | 4
2012-12-06 | 1001 | 3
2012-10-29 | 1001 | 3
I am trying to get a moving average comparing data against last 3 Thursdays without including the current Thursday. Here's my query:
select date, shop_id, amount, extract(dow from date),
avg(amount) OVER (PARTITION BY extract(dow from date) ORDER BY date DESC
ROWS BETWEEN 0 PRECEDING AND 2 FOLLOWING)
from ro
where extract(dow from date) = 4
This is the result given
date | shop_id | amount | date_part | avg
-----------+----------+--------+-----------+--------------------
2013-02-07 | 1001 | 3 | 4 | 2.0000000000000000
2013-01-31 | 1001 | 2 | 4 | 2.6666666666666667
2013-01-24 | 1001 | 1 | 4 | 4.0000000000000000
2013-01-17 | 1001 | 5 | 4 | 6.3333333333333333
2012-12-27 | 1001 | 6 | 4 | 6.0000000000000000
2012-12-20 | 1001 | 8 | 4 | 5.0000000000000000
2012-12-13 | 1001 | 4 | 4 | 3.5000000000000000
2012-12-06 | 1001 | 3 | 4 | 3.0000000000000000
I expect
date | shop_id | amount | date_part | avg
-----------+----------+--------+-----------+--------------------
2013-02-07 | 1001 | 3 | 4 | 2.6666666666666667
2013-01-31 | 1001 | 2 | 4 | 4.0000000000000000
2013-01-24 | 1001 | 1 | 4 | 6.3333333333333333
2013-01-17 | 1001 | 5 | 4 | 6.0000000000000000
2012-12-27 | 1001 | 6 | 4 | 5.0000000000000000
2012-12-20 | 1001 | 8 | 4 |
2012-12-13 | 1001 | 4 | 4 |
2012-12-06 | 1001 | 3 | 4 |
Upvotes: 34
Views: 26364
Reputation: 3535
I think a better answer might be:
SELECT date, shop_id, amount,
extract(dow from date) AS dow,
CASE WHEN count(amount) OVER w = 3
THEN avg(amount) OVER w END AS average_amt
FROM ro
WHERE extract(dow from date) = 4
WINDOW w AS (ORDER BY date DESC ROWS BETWEEN 1 FOLLOWING AND 3 FOLLOWING)
I think it is cleaner to use the same window for both checking the number of rows in the window and taking the average. (This also saves two window aggregations, as can be seen in the original answer.)
Regarding the claim in the earlier answer that "my query avoids unneeded computing by filtering Thursdays before applying the expensive window functions", this also applies to the query suggested by the OP and to my query, as appending EXPLAIN
to either shows.
Upvotes: 13
Reputation: 125574
select
"date",
shop_id,
amount,
extract(dow from date),
case when
row_number() over (order by date) > 3
then
avg(amount) OVER (
ORDER BY date DESC
ROWS BETWEEN 1 following AND 3 FOLLOWING
)
else null end
from (
select *
from ro
where extract(dow from date) = 4
) s
What is wrong with the OP's query is the frame specification:
ROWS BETWEEN 0 PRECEDING AND 2 FOLLOWING
Other than that my query avoids unneeded computing by filtering Thursdays before applying the expensive window functions.
If it is necessary to partition by shop_id then obviously add the partition by shop_id
to both functions, avg
and row_number
.
Upvotes: 23