boski
boski

Reputation: 2467

Fill null values with closest upper and lower non-null value element difference

Is there a way to achieve the following goal using sql in Snowflake?

Original Table:
Order | Value |
1 | 80 |
2 | null |
3 | null |
4 | 20 |

Output table:
Order | Value |
1 | 80 |
2 | 60 |
3 | 40 |
4 | 20 |

I understand this would requiere a fairly complex window operation and may be an overreach for standard sql.

Upvotes: 1

Views: 271

Answers (1)

Simeon Pilgrim
Simeon Pilgrim

Reputation: 26043

Thus assume order is only a direction of the steps and not a linear mapping of the step size. Otherwise rn can be replace with ov to match the steps in your original order.

And it has no partitioning if you are running multiple data sets concurrently.

SELECT * 
    ,lead(v) ignore nulls over (order by ov) as nv
    ,lag(v) ignore nulls over (order by ov) as pv
    ,lead(nrn) ignore nulls over (order by ov) as nnrn
    ,lag(nrn) ignore nulls over (order by ov) as pnrn
    ,nvl(v, ((nv-pv)/(nnrn-pnrn)*(rn-pnrn))+pv) as value
FROM (
    SELECT *
        ,row_number() over (order by ov) as rn
        ,nvl2(v,rn,null) as nrn
    FROM values
        (1,80),
        (2,null),
        (3,null),
        (4,20)
         t(ov, v)
)
order by ov;

gives:

OV V RN NRN NV PV NNRN PNRN VALUE
1 80 1 1 20 4 80
2 2 20 80 4 1 60
3 3 20 80 4 1 40
4 20 4 4 80 1 20

or if you want it to look tidy:

WITH data as (
    SELECT *   
    FROM values
        (1,80),
        (2,null),
        (3,null),
        (4,20),
        (5,null),
        (6,null),
        (7,11)
         t(ov, v)
)
SELECT ov, v, value 
FROM (
    SELECT * 
        ,lead(v) ignore nulls over (order by ov) as nv
        ,lag(v) ignore nulls over (order by ov) as pv
        ,lead(nrn) ignore nulls over (order by ov) as nnrn
        ,lag(nrn) ignore nulls over (order by ov) as pnrn
        ,nvl(v, ((nv-pv)/(nnrn-pnrn)*(rn-pnrn))+pv) as value
    FROM (
        SELECT *
            ,row_number() over (order by ov) as rn
            ,nvl2(v,rn,null) as nrn
        FROM data
    )
)
order by ov;
OV V VALUE
1 80 80
2 60
3 40
4 20 20
5 17
6 14
7 11 11

Upvotes: 2

Related Questions