Reputation: 257
I have per ID monthly cumulative counters, where in some months the counters are missing. I would like to replace these null
entries with values interpolated between the nearest preceding and the closest following non-empty row.
My table looks like this:
ID | Month | Counter |
---|---|---|
AAA | 2023-09 | 1000 |
AAA | 2023-10 | - |
AAA | 2023-11 | - |
AAA | 2023-12 | 4000 |
BBB | 2022-11 | 2000 |
BBB | 2022-12 | - |
BBB | 2023-01 | - |
BBB | 2023-02 | - |
BBB | 2023-03 | 4000 |
What I would like to have:
ID | Month | Counter |
---|---|---|
AAA | 2023-09 | 1000 |
AAA | 2023-10 | 2000 |
AAA | 2023-11 | 3000 |
AAA | 2023-12 | 4000 |
BBB | 2022-11 | 2000 |
BBB | 2022-12 | 2500 |
BBB | 2023-01 | 3000 |
BBB | 2023-02 | 3500 |
BBB | 2023-03 | 4000 |
How can this be done in PostgreSQL?
Upvotes: 1
Views: 52
Reputation: 7065
You can typically use window functions in order to calculate the missing values of Counter like in the following query :
SELECT id
, month
, counter
, first_value(counter) OVER w
+ percent_rank() OVER w
* ( last_value(counter) OVER w - first_value(counter) OVER w) AS new_counter
FROM test
WINDOW w AS (PARTITION BY id ORDER BY month RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY id, month
Result :
id | month | counter | new_counter |
---|---|---|---|
AAA | 2023-09-01 | 1000 | 1000 |
AAA | 2023-10-01 | null | 2000 |
AAA | 2023-11-01 | null | 3000 |
AAA | 2023-12-01 | 4000 | 4000 |
BBB | 2022-11-01 | 2000 | 2000 |
BBB | 2022-12-01 | null | 2500 |
BBB | 2023-01-01 | null | 3000 |
BBB | 2023-02-01 | null | 3500 |
BBB | 2023-03-01 | 4000 | 4000 |
Then you can use this query in an UPDATE function if you want to update your table :
With query AS
( SELECT id
, month
, first_value(counter) OVER w
+ percent_rank() OVER w
* ( last_value(counter) OVER w - first_value(counter) OVER w) AS new_counter
FROM test
WINDOW w AS (PARTITION BY id ORDER BY month RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
)
UPDATE test AS t
SET counter = q.new_counter
FROM query AS q
WHERE t.id = q.id
AND t.month = q.month
AND counter IS null ;
see the demo in dbfiddle
Upvotes: 1