Reputation: 696
I am using caret train function with the preProcess option:
fit <- train(form,
data=train,
preProcess=c("YeoJohnson","center","scale","bagImpute"),
method=model,
metric = "ROC",
tuneLength = tune,
trControl=fitControl)
This preprocesses the training data. However, when I predict, the observations with NAs, they are omitted even though I have bagImpute as an option. I know there is a na.action parameter on predict.train, but I can't get it to work.
predict.train(model, newdata=test, na.action=???)
Is it correct to assume that the predict function automatically preprocesses the new data because the model was trained using the preProcess option? If so, shouldn't the new data be imputed and processed the same way as train? What am i doing wrong?
Thanks for any help.
Upvotes: 0
Views: 1195
Reputation: 14331
You would use na.action = na.pass
. The problem is, while making a working example, I found a bug that occurs with the formula method for train
and imputation. Here is an example without the formula method:
library(caret)
set.seed(1)
training <- twoClassSim(100)
testing <- twoClassSim(100)
testing$Linear05[4] <- NA
fitControl <- trainControl(classProbs = TRUE,
summaryFunction = twoClassSummary)
set.seed(2)
fit <- train(x = training[, -ncol(training)],
y = training$Class,
preProcess=c("YeoJohnson","center","scale","bagImpute"),
method="lda",
metric = "ROC",
trControl=fitControl)
predict(fit, testing[1:5, -ncol(testing)], na.action = na.pass)
The bug will be fixed on the next release of the package.
Max
Upvotes: 1