Reputation: 1097
I have a following table in Vertica:
Item_id event_date Price
A 2019-01-01 100
A 2019-01-04 200
B 2019-01-05 150
B 2019-01-06 250
B 2019-01-09 350
As you see, there are some missing dates between 2019-01-04 and 2019-01-01, and also 2019-01-09 - 2019-01-06. What I need is to add for each item_id missing dates between the existing ones, and as the price cell will be NULL, fill it with the previous date Price. So it will be like this:
Item_id event_date Price
A 2019-01-01 100
A 2019-01-02 100
A 2019-01-03 100
A 2019-01-04 200
B 2019-01-05 150
B 2019-01-06 250
B 2019-01-07 250
B 2019-01-08 250
B 2019-01-09 350
I tried to go with
SELECT Item_id, event_date
CASE Price WHEN 0 THEN NVL( LAG( CASE Price WHEN 0 THEN NULL ELSE Price END ) IGNORE NULLS OVER ( ORDER BY NULL ), 0 ) ELSE Price END AS Price_new
FROM item_price_table
from this article https://blog.jooq.org/2015/12/17/how-to-fill-sparse-data-with-the-previous-non-empty-value-in-sql/ , but it seems it works for SQL Server, but not for Vertica, as there are no IGNORE NULLS function... Does anyone know how to deal with it?
Upvotes: 1
Views: 1267
Reputation: 6749
I was waiting for that one....!
I just love Vertica's TIMESERIES clause ! It works on TIMESTAMPs, not DATEs, so I have to cast back and forth, but it's unbeatable.
See here:
WITH
input(item_id,event_dt,price) AS (
SELECT 'A',DATE '2019-01-01',100
UNION ALL SELECT 'A',DATE '2019-01-04',200
UNION ALL SELECT 'B',DATE '2019-01-05',150
UNION ALL SELECT 'B',DATE '2019-01-06',250
UNION ALL SELECT 'B',DATE '2019-01-09',350
)
SELECT
item_id
, event_dts::DATE AS event_dt
, TS_FIRST_VALUE(price) AS price
FROM input
TIMESERIES event_dts AS '1 DAY' OVER(PARTITION BY item_id ORDER BY event_dt::timestamp)
-- out item_id | event_dt | price
-- out ---------+------------+-------
-- out A | 2019-01-01 | 100
-- out A | 2019-01-02 | 100
-- out A | 2019-01-03 | 100
-- out A | 2019-01-04 | 200
-- out B | 2019-01-05 | 150
-- out B | 2019-01-06 | 250
-- out B | 2019-01-07 | 250
-- out B | 2019-01-08 | 250
-- out B | 2019-01-09 | 350
-- out (9 rows)
-- out
-- out Time: First fetch (9 rows): 68.057 ms. All rows formatted: 68.221 ms
;
Need explanations?
Happy playing ...
Upvotes: 1
Reputation: 1270421
Let me assume you have a calendar table. In Vertica, you can then use last_value(ignore nulls)
to fill in the rest:
select c.event_date, i.item_id,
coalesce(ipt.price,
last_value(ipt.price ignore nulls) over (partition by i.item_id order by c.event_date)
) as price
from calendar c cross join
(select distinct item_id from item_price_table) i left join
item_price_table ipt
on i.item_it = ipt.item_id and c.date = ipt.event_date
Upvotes: 2
Reputation: 1601
I don't have the code on the top of my head, but to add the missing dates you'll want to create a Calendar table and join to that. Then you can use the lag function to replace the null Price with the one above it. There's plenty of code if you search CTE to create calendar table.
Upvotes: 0