Reputation: 4926
I am currently using the caretEnsemble package in R to combine multiple models trained with caret. I have obtained a list of final trained models (say model_list) using the caretList function from the same package, as follows:
model_list <- caretList(
  x = input_predictors,
  y = input_labels,
  metric = 'Accuracy',
  tuneList = list(
    randomForestModel = caretModelSpec(method = 'rf',
                                       tuneLength = 1,
                                       preProcess = c('BoxCox', 'center', 'scale')),
    ldaModel = caretModelSpec(method = 'lda',
                              tuneLength = 1,
                              preProcess = c('BoxCox', 'center', 'scale')),
    logisticRegressionModel = caretModelSpec(method = 'glm',
                                             tuneLength = 1,
                                             preProcess = c('BoxCox', 'center', 'scale'))
  ),
  trControl = myTrainControl
)
The train control object I provided was as follows:
myTrainControl = trainControl(method = "cv",
                              number = 10,
                              index = createResample(training_input_data$retinopathy, 10),
                              savePredictions = TRUE,
                              classProbs = TRUE,
                              verboseIter = TRUE,
                              summaryFunction = twoClassSummary)
Now I am building an ensemble from that list of models as follows:
ens <- caretEnsemble(model_list)
Applying summary to ens tells me the selected models (out of model_list), the weights allocated to those selected models, the out-of-sample AUC values for each of the selected models, and finally the in-sample AUC value for ens.
Now I want to compute the performance of ens on separate test data (to get an idea of its out-of-sample performance). How would I achieve this?
I am trying the following:
ensPredictions <- predict(ens, newdata = test_data)
but it gives me the following error:
Error in `[.data.frame`(out, , obsLevels, drop = FALSE) :
undefined columns selected
Upvotes: 3
Views: 1765
Reputation: 667
The first thing I'd check is whether the test set has all the features of your training set.
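A minimal sketch of that check in base R follows. The two data frames here are made-up stand-ins for your input_predictors and test_data (the column names age and bmi are hypothetical), so swap in your real objects. This only tests the column-name hypothesis and may not turn out to be the cause of your error.

```r
# Hypothetical stand-ins for the asker's objects; replace with the real data.
input_predictors <- data.frame(age = c(50, 61, 47), bmi = c(22.1, 30.4, 27.8))
test_data <- data.frame(age = c(58, 39))  # 'bmi' left out on purpose

# Predictors the models were trained on that the test set does not contain.
missing_cols <- setdiff(names(input_predictors), names(test_data))

# Columns present only in the test set (usually harmless, but worth knowing).
extra_cols <- setdiff(names(test_data), names(input_predictors))

print(missing_cols)
print(extra_cols)
```

If missing_cols is not empty, the test set lacks predictors that the models expect, and adding those columns (with the same names as in training) would be the first thing to try before calling predict again.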
Upvotes: 1