Gisela
Gisela

Reputation: 1244

select within period plus last before period

thanks to everyone who took the time to comment and answer.

-

I have a price history table like that (pseudocode):

table price_history (
    product_id,
    price,
    changed_date
)

in which the historical prices of some products are stored:

1,  1.0, '2017-12-18'
1,  1.2, '2017-12-20'
1,  0.9, '2018-04-20'
1,  1.1, '2018-07-20'
1,  1.3, '2018-07-22'
2, 10.0, '2017-12-15'
2, 11.0, '2017-12-16'
2,  9.9, '2018-01-02'
2, 10.3, '2018-04-04

Now I want the prices of some products within a certain period. Eg. between 2018-01-01 and now.

The simple approach:

    SELECT * FROM price_history
      WHERE product_id in (1,2) AND changed_date >= 2018-01-01

is not ok, since the individual price for each product from 2018-01-01 until the first price change is not included:

1,  0.9, '2018-04-20'
1,  1.1, '2018-07-20'
1,  1.3, '2018-07-22'
2,  9.9, '2018-01-02'
2, 10.3, '2018-04-04

But it is crucial to know the prices from the start of the period.

So, in addition to the price changes within the period, the last change before must also included. The result should be like so:

1,  1.2, '2017-12-20'
1,  0.9, '2018-04-20'
1,  1.1, '2018-07-20'
1,  1.3, '2018-07-22'
2, 11.0, '2017-12-16'
2,  9.9, '2018-01-02'
2, 10.3, '2018-04-04

Q: how to specify such a select statement?

Edit:

The test scenario and the solution from Ajay Gupta

CREATE TABLE price_history (
        product_id integer,
        price float,
        changed_date timestamp
    );

INSERT INTO price_history (product_id,price,changed_date) VALUES
    (1, 1.0, '2017-12-18'),
    (1, 1.2, '2017-12-20'),
    (1, 0.9, '2018-04-20'),
    (1, 1.1, '2018-07-20'),
    (1, 1.3, '2018-07-22'),
    (2, 10.0, '2017-12-15'),
    (2, 11.0, '2017-12-16'),
    (2, 9.9, '2018-01-02'),
    (2, 10.3, '2018-04-04');

Winning Select:

with cte1 as
  (Select *, lag(changed_date,1,'01-01-1900')
  over(partition by product_id order by changed_date)
    as FromDate from price_history),
  cte2 as (Select product_id, max(FromDate)
           as changed_date from cte1
           where '2018-01-01'
             between FromDate and changed_date group by product_id)
  Select p.* from price_history p
    join cte2 c on p.product_id = c.product_id
  where p.changed_date >= c.changed_date
  order by product_id,changed_date;

Result:

 product_id | price |    changed_date     
------------+-------+---------------------
          1 |   1.2 | 2017-12-20 00:00:00
          1 |   0.9 | 2018-04-20 00:00:00
          1 |   1.1 | 2018-07-20 00:00:00
          1 |   1.3 | 2018-07-22 00:00:00
          2 |    11 | 2017-12-16 00:00:00
          2 |   9.9 | 2018-01-02 00:00:00
          2 |  10.3 | 2018-04-04 00:00:00

I must admit, this is way beyond my limited (PG-)SQL skills.

Upvotes: 2

Views: 127

Answers (5)

Abelisto
Abelisto

Reputation: 15624

The solution with union is still simpler but not realized correctly in other answers. So:

SELECT * FROM price_history
      WHERE product_id in (1,2) AND changed_date >= '2018-01-01'
union all
(
  select distinct on (product_id)
    *
  from price_history
  where product_id in (1,2) AND changed_date < '2018-01-01'
  order by product_id, changed_date desc)
order by product_id, changed_date;

Demo

Upvotes: 0

Ajay Gupta
Ajay Gupta

Reputation: 1845

Using Lag and cte

with cte1 as (
  Select *, 
         lag(changed_date,1,'01-01-1900') over(partition by product_id order by changed_date) as FromDate 
  from price_history
), cte2 as  (
   Select product_id, max(FromDate) as changed_date 
   from cte1  
   where '2018-01-01' between FromDate and changed_date 
   group by product_id
)
Select p.* 
from price_history p 
  join cte2 c on p.product_id = c.product_id
where p.changed_date >= c.changed_date;

Upvotes: 1

Hooman Bahreini
Hooman Bahreini

Reputation: 15559

If you did have the option to change your table structure, a different approach would be to have both start_date and end_date in your table, this way your records would not depend on prev/next row and your query becomes easier to write. See Slowly changing dimension - Type 2

If you want to solve the problem with existing structure, in PostgresSQL you can use LIMIT 1 to get latest record before changed_date:

SELECT 
    * 
FROM 
    price_history 
WHERE 
    product_id in (1,2) 
    AND changed_date >= '2018-01-01'
UNION ALL
-- this would give you the latest price before changed_date
SELECT 
    * 
FROM 
    price_history 
WHERE 
   product_id in (1,2) 
   AND changed_date < '2018-01-01'
ORDER BY
   changed_date DESC
LIMIT 1

Upvotes: 0

Zaynul Abadin Tuhin
Zaynul Abadin Tuhin

Reputation: 31993

You need 1st change date and all other date >"2018-01-01"

 select product_id,price, changed_date
    from
    (
    select product_id,price, changed_date,
    row_number() over(partition by product_id order by changed_date ) as rn
    from price_history
    ) x
    where x.rn = 2 and product_id in (1,2);
union all
select product_id,price, changed_datefrom from price_history
where product_id in (1,2) and changed_date >= '2018-01-01'

Upvotes: 0

zacs
zacs

Reputation: 132

I guess this is what you are looking for

SELECT Top 1 * FROM price_history WHERE product_id in (1,2) AND changed_date < 2018-01-01
UNION ALL
SELECT * FROM price_history WHERE product_id in (1,2) AND changed_date >= 2018-01-01

Upvotes: 1

Related Questions