Reputation: 125
I have two data frames and two questions. In both data frames df1 and df2, I can replace the NAs by 0.
df1
var1 <- c(1, NA, 2, NA, 4, 5, 5)
var2 <- c(1, 2, 3, 4, 5, 6, 7)
df1 <- data.frame(var1, var2)
df1$var1[is.na(df1$var1)] <- 0
df2
var1 <- c(1, NA, 2, NA, 4, 5, 9)
var2 <- c(1, 2, 3, 4, 5, 6, 7)
df2 <- data.frame(var1, var2)
df2$var1[is.na(df2$var1)] <- 0
But how would this work if I wanted to replace the NAs by the maximum value of var1 rather than 0? I thought it would be the following but it does not work.
df1$var1[is.na(df1$var1)] <- max(df1$var1)
Once this is solved, I would actually like to do this for a list of data frames using lapply.
mylist <- list(df1, df2)
My idea was something like the following - which does not work either.
lapply(mylist, function(x) x$var1[is.na(x$var1)] <- max(x$var1))
Many thanks for your help!
Upvotes: 2
Views: 1452
Reputation: 263332
You need to use na.rm=TRUE in max:
> df1$var1[is.na(df1$var1)] <- max(df1$var1, na.rm=TRUE)
>
> var1 <- c(1, NA, 2, NA, 4, 5, 9)
> var2 <- c(1, 2, 3, 4, 5, 6, 7)
> df2 <- data.frame(var1, var2)
> df2$var1[is.na(df2$var1)] <- max(df2$var1, na.rm=TRUE)
> df1
  var1 var2
1    1    1
2    5    2
3    2    3
4    5    4
5    4    5
6    5    6
7    5    7
> df2
  var1 var2
1    1    1
2    9    2
3    2    3
4    9    4
5    4    5
6    5    6
7    9    7
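The reason the original attempt fails is that max() returns NA as soon as its input contains an NA, so the NAs were simply being replaced by NA. A minimal sketch of that behaviour, using a standalone vector v (a hypothetical name, with the same values as df1$var1 in the question):

```r
# v is a hypothetical stand-in for df1$var1 from the question
v <- c(1, NA, 2, NA, 4, 5, 5)

max(v)                # NA: missing values propagate by default
max(v, na.rm = TRUE)  # 5: NAs are dropped before taking the maximum

# Replace each NA with the maximum of the observed values
v[is.na(v)] <- max(v, na.rm = TRUE)
v                     # 1 5 2 5 4 5 5
```

One caveat: if a column consists only of NAs, max(..., na.rm=TRUE) returns -Inf with a warning, so that case may need separate handling.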
Your attempt with lapply missed the fact that you need to make the modified data frame the last object evaluated in the function. The value of an assignment with [<- is just the assigned value, not the full data frame, so return x explicitly:
lapply(mylist, function(x) {x$var1[is.na(x$var1)] <- max(x$var1, na.rm=TRUE); x})
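For completeness, here is a self-contained sketch of the same idea that rebuilds the two data frames from the question. The helper name fill_na_with_max is an assumption made for illustration, not part of the original code. Note that lapply works on copies, so the result has to be assigned back for the change to stick:

```r
# Rebuild the data frames from the question
df1 <- data.frame(var1 = c(1, NA, 2, NA, 4, 5, 5), var2 = 1:7)
df2 <- data.frame(var1 = c(1, NA, 2, NA, 4, 5, 9), var2 = 1:7)
mylist <- list(df1, df2)

# fill_na_with_max is a hypothetical helper name
fill_na_with_max <- function(x) {
  x$var1[is.na(x$var1)] <- max(x$var1, na.rm = TRUE)
  x  # return the whole data frame, not the value of the assignment
}

# lapply returns a new list; df1 and df2 themselves are left unchanged
mylist <- lapply(mylist, fill_na_with_max)

mylist[[1]]$var1  # 1 5 2 5 4 5 5
mylist[[2]]$var1  # 1 9 2 9 4 5 9
```

Each data frame uses its own maximum (5 for the first, 9 for the second), because max is computed inside the function on the data frame being processed.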
Upvotes: 3