Reputation: 11
We can use the missForest package to impute missing values in R (for both categorical and numeric variables). But this approach seems to require a complete response variable for training the forest. So how can we impute missing values in the test data set with missForest, given that the test data set has no response variable?
Upvotes: 1
Views: 4203
Reputation: 23598
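You can just use missForest on the predictors. It does not need the response variable: each column with missing values is imputed from the other columns of the data you pass in. See the code below.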
library(missForest)
# Remove the response variable (Species, column 5)
my_iris <- iris[, -5]
# Artificially produce missing values using the 'prodNA' function
set.seed(81)
iris.mis <- prodNA(my_iris, noNA = 0.2)
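# Impute the missing values
iris.imp <- missForest(iris.mis, verbose = TRUE)
# Out-of-bag estimate of the imputation error (NRMSE, since all columns are numeric)
iris.imp$OOBerror
# True imputation error: only returned when the complete data are supplied via 'xtrue', so not available here
iris.imp$error
# Imputed data
iris.imp$ximp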
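For the train/test situation in the question, one possible approach is to stack the predictors of both sets, impute them together, and split them back afterwards. The following is only a minimal sketch under assumptions that are not in the original post: the split of iris into 100 training and 50 test rows, and the names test_idx, train_x, test_x, all_x, train_imp and test_imp are hypothetical and chosen for illustration.

```r
library(missForest)

set.seed(81)
# Hypothetical train/test split; Species (column 5) plays the role of the response
test_idx <- sample(nrow(iris), 50)
train_x <- iris[-test_idx, -5]   # training predictors, response dropped
test_x  <- iris[test_idx, -5]    # test predictors, no response available

# Artificially introduce missing values into both sets of predictors
train_x <- prodNA(train_x, noNA = 0.1)
test_x  <- prodNA(test_x, noNA = 0.1)

# Stack the predictors, impute them in one call, then split back by row position
all_x   <- rbind(train_x, test_x)
all_imp <- missForest(all_x)$ximp
n_train <- nrow(train_x)
train_imp <- all_imp[seq_len(n_train), ]
test_imp  <- all_imp[-seq_len(n_train), ]
```

Note that imputing both sets together lets the test rows influence the imputed training values, which may or may not be acceptable depending on how strictly the test set has to be held out.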
Upvotes: 2