Reputation: 33459
Let's say I have the following table:
CREATE TABLE stock_prices (
stock TEXT NOT NULL,
date DATE NOT NULL,
price REAL NOT NULL,
UNIQUE (stock, date)
);
I want to calculate for each day, the highest price for each stock in the preceding 3-month window.
I can't do a simple self join with date - INTERVAL(3 'MONTH')
since my stock_price
table has some "holes" for holidays and weekends. Similarly a naive window also does not work:
SELECT
stock,
date,
LAST_VALUE(price) OVER (PARTITION BY stock ORDER BY date ROWS 90 PRECEDING)
FROM stock_prices
I almost want a window frame here with a condition based on the current row. Is that possible in PostgreSQL?
Upvotes: 7
Views: 2010
Reputation: 1673
The generate_series option should work well, although since months aren't always 30 days it won't always align to calendar months.
If you want to use an interval, you can also do a self join and aggregate. This will join each row onto all rows that meet the criteria (in this case I've set the interval at 1 week), and get the Max value within that result set:
select a.stock,
a.date,
a.price,
max( b.price )
from stock_prices as a
left join
stock_prices as b
on a.stock = b.stock
and b.date between (a.date - interval '7 days') and a.date
group by a.stock,
a.date,
a.price
order by a.stock,
a.date
SQL fiddle here: http://sqlfiddle.com/#!15/bbec8/2
Upvotes: 1
Reputation: 121634
You can fill up the table with the missing rows using the function generate_series ()
, so a window function would return correct data. You can select the period of report specifying start and end dates in generate_series ()
:
select
stock,
date,
price,
max(price) over (partition by stock order by date rows 90 preceding)
from (
select d::date as date, s.stock, sp.price
from generate_series('2016-01-01'::date, '2016-07-28', '1d') g(d)
cross join (
select distinct stock
from stock_prices
) s
left join stock_prices sp on g.d = sp.date and s.stock = sp.stock
) s
order by 1, 2;
This alternative solution with a simple subquery:
select
stock,
date,
price,
(
select max(price)
from stock_prices sp2
where sp2.stock = sp1.stock
and sp2.date >= sp1.date- interval '90days'
and sp2.date <= sp1.date
) highest_price
from
stock_prices sp1
order by 1, 2;
will be much more expensive. In this case you should obligatory use the index
create index on stock_prices (stock, date);
Upvotes: 7