Reputation: 847
I have the following table:
CREATE TABLE IF NOT EXISTS test (
date DATE,
value INT
);
INSERT INTO test VALUES
( '2022-01-11', NULL), -- Tues
( '2022-01-10', 5), -- Mon
( '2022-01-09', NULL), -- Sun
( '2022-01-08', 6), -- Sat
( '2022-01-07', NULL), -- Fri
( '2022-01-06', NULL), -- Thur
( '2022-01-05', 5), -- Wed
( '2022-01-04', NULL), -- Tues
( '2022-01-03', NULL), --Mon
( '2022-01-02', NULL), -- Sun
( '2022-01-01', NULL); -- Sat
What is a query that forward and backfills the value column, for calendar week (Monday through Sunday)
For a calendar week should forward fill null values up to the next non-null value, and backfill null values up to the previous non-null value. If the calendar week has all null values then the week should be null
The result should look like this:
date | VALUES |
---|---|
'2021-01-11' | 5 -- Tues, filled from Monday |
'2022-01-10' | 5 -- Mon |
'2022-01-09' | 6 -- Sun Filled from saturday |
'2022-01-08' | 6 -- Sat |
'2022-01-07' | 5 -- Fri Filled from Wed |
'2022-01-06' | 5 -- Thur Filled from Wed |
'2022-01-05' | 5 -- Wed |
'2022-01-04' | 5 -- Tues Filled from Wed |
'2022-01-03' | 5 --Mon Filled from Wed |
'2022-01-02' | Null -- Sun |
'2022-01-01' | Null -- Sat No values in week, therefor null |
Upvotes: 1
Views: 351
Reputation: 847
This query looks for the next or previous non-null value within the same week (or in the following week if there is no non-null value in the same week) for each date in the table, and uses it to fill any null value.
The query builds the groups by first partitioning the data by week, and then constructs a window frame within each group to find the next or previous non-null value.
SELECT date,
value,
COALESCE(value,
-- prior non-NULL value
(ARRAY_AGG(value)
FILTER (WHERE value IS NOT NULL)
OVER (PARTITION BY DATE_TRUNC('week', date)
ORDER BY date DESC
ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING))[1],
-- next non-NULL value
(ARRAY_AGG(value) FILTER (WHERE value IS NOT NULL) OVER
(PARTITION BY DATE_TRUNC('week', date)
ORDER BY date ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING))[1]
) AS substitute_level
FROM test
ORDER BY date DESC;
Demo on db fiddle
Upvotes: 0
Reputation: 222582
That would have been a good spot for last_value
with option ignore nulls
, but unfortunately Postgres does not support the latter.
Here is one way to do it in Postgres with window functions:
select date, value,
max(value) over(partition by date_trunc('week', date), grp) new_value
from (
select t.*,
count(value) over(partition by date_trunc('week', date) order by date) grp
from test t
) t
order by date desc
The approach is to build groups of consecutive rows of the same week that contain a non-null value, possibly followed by null values, using a window count
, and then bring the latest previous non-null value. You can run the subquery separately to see how groups are built.
If you also want the query to look for the next value when no previous value is availability in the same week, we can just expand the logic and define two groups for each row, then try both:
select date, value,
coalesce(
max(value) over(partition by date_trunc('week', date), grp1),
max(value) over(partition by date_trunc('week', date), grp2)
) new_value
from (
select t.*,
count(value) over(partition by date_trunc('week', date) order by date ) grp1,
count(value) over(partition by date_trunc('week', date) order by date desc) grp2
from test t
) t
order by date desc
Upvotes: 2