Reputation: 2609
I am trying to impute missing values in a data frame with the impute function from Hmisc. I can impute one column at a time, but I fail when I try to loop over the columns.
The example below works fine, but I would like to make it dynamic using a function:
impute_marks$col1 <- with(impute_marks, round(impute(col1, mean)),0)
Example:
impute_dataframe <- function()
{
for(i in 1:ncol(impute_marks))
{
impute_marks[is.na(impute_marks[,i]), i] <- with(impute_marks, round(impute(impute_marks[,i], mean)),0)
}
}
impute_dataframe
There is no error when I run the function, but the dataset impute_marks does not receive any imputed data either.
Upvotes: 0
Views: 1047
Reputation: 886938
We can use na.aggregate from zoo, which can be applied directly to the dataset:
library(zoo)
round(na.aggregate(mydf))
# age1 age2
#1 1 3
#2 2 4
#3 2 3
#4 4 1
Or apply it to each column separately with lapply:
mydf[] <- lapply(mydf, function(x) round(na.aggregate(x)))
By default, na.aggregate fills the missing values with the mean, but we can change this through the FUN argument.
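As a rough sketch of changing FUN, the snippet below fills each column with its median instead of the mean. The data frame scores and its values are made up for illustration. The anonymous function is deliberate: lapply has its own FUN argument, so writing lapply(scores, na.aggregate, FUN = median) would hand median to lapply rather than to na.aggregate.

```r
library(zoo)

# Hypothetical example data (not from the question)
scores <- data.frame(a = c(1, 2, NA, 10),
                     b = c(NA, 4, 3, 1))

# Wrap na.aggregate in an anonymous function so that FUN = median
# reaches na.aggregate and is not captured by lapply's own FUN argument
filled <- scores
filled[] <- lapply(scores, function(x) na.aggregate(x, FUN = median))

filled
```

Here the median of the observed values in a (1, 2, 10) is 2, whereas the mean would be about 4.33, so the choice of FUN matters for skewed columns.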
Upvotes: 1
Reputation: 10671
Hmisc::impute is already a function, so why not just use apply and save a for loop?
library(Hmisc)
age1 <- c(1,2,NA,4)
age2 <- c(NA, 4, 3, 1)
mydf <- data.frame(age1, age2)
mydf
#  age1 age2
#1    1   NA
#2    2    4
#3   NA    3
#4    4    1
apply(mydf, 2, function(x) {round(impute(x, mean))})
#  age1 age2
#1    1    3
#2    2    4
#3    2    3
#4    4    1
EDIT: apply returns a matrix, so to keep mydf as a data.frame you could coerce the result back like this:
mydf <- as.data.frame(apply(mydf, 2, function(x) round(impute(x, mean))))
But what I'd do is use another package, purrr, which is a nice set of tools built around this apply/mapping idea. map_df, for example, will always return a data.frame object. There are a bunch of map_x variants that you can see with ?map:
library(purrr)
map_df(mydf, ~ round(impute(., mean)))
I know it is preferred to use the base R functions, but purrr makes apply-style operations so much easier.
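For comparison, here is a minimal base R sketch of the reusable function the question asks for. The helper name impute_mean is hypothetical, and it swaps Hmisc::impute for plain mean(x, na.rm = TRUE). It also rounds only the filled-in values, while round(impute(x, mean)) rounds the whole column. The key point is that R functions work on a copy of their arguments, so the function has to return the modified data frame and the caller has to assign the result. It must also be called with parentheses, since typing the bare function name only prints its definition.

```r
# Hypothetical helper: fill NAs in every numeric column with the
# rounded mean of that column's observed values
impute_mean <- function(df) {
  numeric_cols <- vapply(df, is.numeric, logical(1))
  df[numeric_cols] <- lapply(df[numeric_cols], function(x) {
    x[is.na(x)] <- round(mean(x, na.rm = TRUE))
    x
  })
  df  # return the modified copy; the original is left untouched
}

mydf <- data.frame(age1 = c(1, 2, NA, 4),
                   age2 = c(NA, 4, 3, 1))

# Call the function (with parentheses) and assign the result back
mydf <- impute_mean(mydf)
mydf
```

Passing the data frame as an argument, rather than reaching for a global like impute_marks, keeps the function usable on any dataset.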
Upvotes: 2