Aby
Aby

Reputation: 167

Replace NA with previous value for certain column in R

I want to convert NAs to a certain value, based on the given variable value for that specific ID. Sample query: df1 ---> df2

df1 = data.frame(ID=c(1,1, 1, 1, 2,2,2,2,3,3,3,3),WHR=c(0.8,NA, NA, NA,1.0, NA, NA,NA,1.1, NA, NA, NA))

df2=data.frame(ID=c(1,1, 1, 1, 2,2,2,2,3,3,3,3),WHR=c(0.8,0.8, 0.8, 0.8,1.0, 1.0,1.0,1.0, 1.1, 1.1,1.1,1.1))`

What I tried

R fill in NA with previous row value with condition:

library(xts)
df1[,WHR:=na.locf("WHR", fromlast=TRUE, by = ID)` 

Got error:

could not find function ":="; 

I used this code because I have hundreds of ID values and I would like an automatic code that changes NAs in a particular column based on ID.

How can I convert df1 to df2? (pls explain your code as well, so it may help other beginner users). Thank you!

Upvotes: 2

Views: 2134

Answers (3)

RustamA
RustamA

Reputation: 91

mean by ID:

for (i in unique(df1$ID)) df1[df1$ID==i & is.na(df1[,'WHR']),'WHR'] <- mean(df2[df2$ID==i,'WHR'])

Upvotes: 0

Karsten W.
Karsten W.

Reputation: 18420

You could construct a mapping for the missing values:

idx <- !is.na(df1[,"WHR"])
map <- setNames(df1[idx,"WHR"], df1[idx,"ID"])

and then apply this map to the NA values

df2[!idx, "WHR2"] <- map[df2[!idx, "ID"]]

Upvotes: 1

Zelazny7
Zelazny7

Reputation: 40628

Using base R you can broadcast the first non-NA using the cumsum of their locations and the ave function:

df2$WHR.Comp = ave(df1$WHR, cumsum(!is.na(df1$WHR)), FUN=function(x) x[1])

> df2
   ID WHR WHR.Comp
1   1 0.8      0.8
2   1 0.8      0.8
3   1 0.8      0.8
4   1 0.8      0.8
5   2 1.0      1.0
6   2 1.0      1.0
7   2 1.0      1.0
8   2 1.0      1.0
9   3 1.1      1.1
10  3 1.1      1.1
11  3 1.1      1.1
12  3 1.1      1.1

Upvotes: 2

Related Questions