Reputation: 535
I am trying to fill null values in a numeric table using duckdb.
Let's consider the simple table df
:
x | y |
---|---|
0 | 2.0 |
1 | NULL |
3 | NULL |
5 | 7.0 |
6 | 4.0 |
8 | 5.0 |
9 | NULL |
that you can create with
CREATE TABLE df (x NUMERIC, y NUMERIC)
INSERT INTO df VALUES (0, 2), (1, NULL), (3, NULL), (5, 7), (6, 4), (8, 5), (9, NULL)
then i would like to create a SELECT statement to retrieve:
x | y |
---|---|
0 | 2.0 |
1 | 3.0 |
3 | 5.0 |
5 | 7.0 |
6 | 4.0 |
8 | 5.0 |
9 | NULL |
I think that a correlated subquery which finds the following and previous non-null values and then computes the interpolation should work:
SELECT
x,
COALESCE(
y,
(
SELECT
prev.y + ( (next.y - prev.y) * (parent.x - prev.x) / (next.x - prev.x) )
FROM
( SELECT x, y FROM df WHERE x <= parent.x and y is not null ORDER BY x DESC LIMIT 1 ) AS prev
CROSS JOIN
( SELECT x, y FROM df WHERE x >= parent.x and y is not null ORDER BY x ASC LIMIT 1 ) AS next
)
) AS y
FROM
df parent
But duckdb is complaining with
INTERNAL Error: Failed to bind column reference "limit_rownum" [30.0] (bindings: [8.0 8.1 8.2 28.0])
EDIT: Adjusted the example to ensure that x
and y
columns are both taken into account for interpolating the gaps.
Upvotes: 1
Views: 947
Reputation: 35583
I believe you can bypass the error message by adopting row_number() as proposed in this duckdb bug reference. nb: The fiddle below uses postgres as I'm unable to test in duckdb:
CREATE TABLE df (x NUMERIC(3,1), y NUMERIC(3,1));
INSERT INTO df VALUES (0, 2), (1, NULL), (3, NULL), (5, 7), (6, 4), (8, 5), (9, NULL)
SELECT
x,
COALESCE(
y,
(
SELECT
prev.y + ( (next.y - prev.y) * (parent.x - prev.x) / (next.x - prev.x) )
FROM
( SELECT x, y, ROW_NUMBER() OVER (ORDER BY x DESC) as rn FROM df WHERE x <= parent.x and y is not null ) AS prev
CROSS JOIN
( SELECT x, y, ROW_NUMBER() OVER (ORDER BY x ASC) as rn FROM df WHERE x >= parent.x and y is not null ) AS next
WHERE
prev.rn = next.rn
)
) AS y
FROM
df parent
x | y |
---|---|
0.0 | 2.0 |
1.0 | 3.00000000000000000000 |
3.0 | 5.0000000000000000 |
5.0 | 7.0 |
6.0 | 4.0 |
8.0 | 5.0 |
9.0 | null |
Upvotes: 2