Reputation: 1244
thanks to everyone who took the time to comment and answer.
-
I have a price history table like that (pseudocode):
table price_history (
product_id,
price,
changed_date
)
in which the historical prices of some products are stored:
1, 1.0, '2017-12-18'
1, 1.2, '2017-12-20'
1, 0.9, '2018-04-20'
1, 1.1, '2018-07-20'
1, 1.3, '2018-07-22'
2, 10.0, '2017-12-15'
2, 11.0, '2017-12-16'
2, 9.9, '2018-01-02'
2, 10.3, '2018-04-04
Now I want the prices of some products within a certain period. Eg. between 2018-01-01 and now.
The simple approach:
SELECT * FROM price_history
WHERE product_id in (1,2) AND changed_date >= 2018-01-01
is not ok, since the individual price for each product from 2018-01-01 until the first price change is not included:
1, 0.9, '2018-04-20'
1, 1.1, '2018-07-20'
1, 1.3, '2018-07-22'
2, 9.9, '2018-01-02'
2, 10.3, '2018-04-04
But it is crucial to know the prices from the start of the period.
So, in addition to the price changes within the period, the last change before must also included. The result should be like so:
1, 1.2, '2017-12-20'
1, 0.9, '2018-04-20'
1, 1.1, '2018-07-20'
1, 1.3, '2018-07-22'
2, 11.0, '2017-12-16'
2, 9.9, '2018-01-02'
2, 10.3, '2018-04-04
Q: how to specify such a select statement?
Edit:
The test scenario and the solution from Ajay Gupta
CREATE TABLE price_history (
product_id integer,
price float,
changed_date timestamp
);
INSERT INTO price_history (product_id,price,changed_date) VALUES
(1, 1.0, '2017-12-18'),
(1, 1.2, '2017-12-20'),
(1, 0.9, '2018-04-20'),
(1, 1.1, '2018-07-20'),
(1, 1.3, '2018-07-22'),
(2, 10.0, '2017-12-15'),
(2, 11.0, '2017-12-16'),
(2, 9.9, '2018-01-02'),
(2, 10.3, '2018-04-04');
Winning Select:
with cte1 as
(Select *, lag(changed_date,1,'01-01-1900')
over(partition by product_id order by changed_date)
as FromDate from price_history),
cte2 as (Select product_id, max(FromDate)
as changed_date from cte1
where '2018-01-01'
between FromDate and changed_date group by product_id)
Select p.* from price_history p
join cte2 c on p.product_id = c.product_id
where p.changed_date >= c.changed_date
order by product_id,changed_date;
Result:
product_id | price | changed_date
------------+-------+---------------------
1 | 1.2 | 2017-12-20 00:00:00
1 | 0.9 | 2018-04-20 00:00:00
1 | 1.1 | 2018-07-20 00:00:00
1 | 1.3 | 2018-07-22 00:00:00
2 | 11 | 2017-12-16 00:00:00
2 | 9.9 | 2018-01-02 00:00:00
2 | 10.3 | 2018-04-04 00:00:00
I must admit, this is way beyond my limited (PG-)SQL skills.
Upvotes: 2
Views: 127
Reputation: 15624
The solution with union
is still simpler but not realized correctly in other answers. So:
SELECT * FROM price_history
WHERE product_id in (1,2) AND changed_date >= '2018-01-01'
union all
(
select distinct on (product_id)
*
from price_history
where product_id in (1,2) AND changed_date < '2018-01-01'
order by product_id, changed_date desc)
order by product_id, changed_date;
Upvotes: 0
Reputation: 1845
Using Lag
and cte
with cte1 as (
Select *,
lag(changed_date,1,'01-01-1900') over(partition by product_id order by changed_date) as FromDate
from price_history
), cte2 as (
Select product_id, max(FromDate) as changed_date
from cte1
where '2018-01-01' between FromDate and changed_date
group by product_id
)
Select p.*
from price_history p
join cte2 c on p.product_id = c.product_id
where p.changed_date >= c.changed_date;
Upvotes: 1
Reputation: 15559
If you did have the option to change your table structure, a different approach would be to have both start_date and end_date in your table, this way your records would not depend on prev/next row and your query becomes easier to write. See Slowly changing dimension - Type 2
If you want to solve the problem with existing structure, in PostgresSQL you can use LIMIT 1
to get latest record before changed_date:
SELECT
*
FROM
price_history
WHERE
product_id in (1,2)
AND changed_date >= '2018-01-01'
UNION ALL
-- this would give you the latest price before changed_date
SELECT
*
FROM
price_history
WHERE
product_id in (1,2)
AND changed_date < '2018-01-01'
ORDER BY
changed_date DESC
LIMIT 1
Upvotes: 0
Reputation: 31993
You need 1st change date and all other date >"2018-01-01"
select product_id,price, changed_date
from
(
select product_id,price, changed_date,
row_number() over(partition by product_id order by changed_date ) as rn
from price_history
) x
where x.rn = 2 and product_id in (1,2);
union all
select product_id,price, changed_datefrom from price_history
where product_id in (1,2) and changed_date >= '2018-01-01'
Upvotes: 0
Reputation: 132
I guess this is what you are looking for
SELECT Top 1 * FROM price_history WHERE product_id in (1,2) AND changed_date < 2018-01-01
UNION ALL
SELECT * FROM price_history WHERE product_id in (1,2) AND changed_date >= 2018-01-01
Upvotes: 1