Reputation: 507
I have a dataset in which I want to fill missing values witht the closest non-missing value. I found two elegant solutions in the answers to this question, but I don't understand why they are not working for me.
Table:
create table Tab1(data date, V1 number);
insert into Tab1 values (date '2000-01-01', 1);
insert into Tab1 values (date '2000-02-01', 1);
insert into Tab1 values (date '2000-03-01', 1);
insert into Tab1 values (date '2000-04-01', 1);
insert into Tab1 values (date '2000-05-01', NULL);
insert into Tab1 values (date '2000-06-01', NULL);
insert into Tab1 values (date '2000-03-01', 2);
insert into Tab1 values (date '2000-04-01', 2);
insert into Tab1 values (date '2000-05-01', NULL);
insert into Tab1 values (date '2000-06-01', NULL);
select * from Tab1;
DATA V1
2000-01-01 1
2000-02-01 1
2000-03-01 1
2000-04-01 1
2000-05-01
2000-06-01
2000-03-01 2
2000-04-01 2
2000-05-01
2000-06-01
Attempt #1:
select A.*,
(case when V1 is null then lag(V1 ignore nulls)
over (partition by V1 order by V1, data)
else V1
end) V2
from Tab1 A;
Attempt #2:
select A.*,
(case when V1 is null
then last_value(V1 ignore nulls)
over (partition by V1 order by data
range between unbounded preceding and 1 preceding)
else V1
end) V2
from Tab1 A;
Both give me the same unwanted result:
DATA V1 V2
2000-01-01 1 1
2000-02-01 1 1
2000-03-01 1 1
2000-04-01 1 1
2000-03-01 2 2
2000-04-01 2 2
2000-05-01
2000-05-01
2000-06-01
What am I doing wrong?
Upvotes: 1
Views: 1382
Reputation: 131
Hi or you can try manual create the ignore null solution by refer to my below. thanks https://stackoverflow.com/a/57016373/10562099
Upvotes: 0
Reputation: 1270513
Your first version should work, with a slight tweak:
select A.*,
coalesce(V1, lag(V1 ignore nulls) over (order by data)) V2
from Tab1 A;
The tweak is to remove the partition by v1
from the lag()
. The coalesce()
is just my preference for simpler expressions.
The same tweak should work for the second version as well.
Your version doesn't work because the lag()
value must come from the same partition (or be null
). When you have partition by v1
, you are actually ensuring that v1
has the same value as in the current row.
Upvotes: 1