Reputation: 157
I'm trying out ensembles of caret models in R, using the caretEnsemble package on the BostonHousing2 dataset. When I set up the greedy ensemble (and likewise the linear ensemble), I get an error. The code is as follows:
library(caret)
library(caretEnsemble)
library(mlbench)
data(BostonHousing2)
X <- model.matrix(cmedv~crim+zn+indus+chas+nox+rm+age+dis+
rad+tax+ptratio+b+lstat+lat+lon, BostonHousing2)[,-1]
X <- data.frame(X)
Y <- BostonHousing2$cmedv
train <- runif(nrow(X)) <= 0.7
folds=5
repeats=1
myControl <- trainControl(method='cv', number=folds, repeats=repeats, returnResamp='none',
returnData=FALSE, savePredictions=TRUE,
verboseIter=TRUE, allowParallel=TRUE,
index=createMultiFolds(Y[train], k=folds, times=repeats))
PP <- c('center', 'scale')
model1 <- train(X[train,], Y[train], method='gbm', trControl=myControl,
tuneGrid=expand.grid(.n.trees=500, .interaction.depth=15, .shrinkage = 0.01))
model2 <- train(X[train,], Y[train], method='blackboost', trControl=myControl)
model3 <- train(X[train,], Y[train], method='parRF', trControl=myControl)
model4 <- train(X[train,], Y[train], method='mlpWeightDecay', trControl=myControl, trace=FALSE, preProcess=PP)
model5 <- train(X[train,], Y[train], method='ppr', trControl=myControl, preProcess=PP)
model6 <- train(X[train,], Y[train], method='earth', trControl=myControl, preProcess=PP)
model7 <- train(X[train,], Y[train], method='glm', trControl=myControl, preProcess=PP)
model8 <- train(X[train,], Y[train], method='svmRadial', trControl=myControl, preProcess=PP)
model9 <- train(X[train,], Y[train], method='gam', trControl=myControl, preProcess=PP)
model10 <- train(X[train,], Y[train], method='glmnet', trControl=myControl, preProcess=PP)
all.models <- list(model1, model2, model3, model4, model5, model6, model7, model8, model9, model10)
names(all.models) <- sapply(all.models, function(x) x$method)
sort(sapply(all.models, function(x) min(x$results$RMSE)))
greedy <- caretEnsemble(all.models, iter=1000L)
Error: is(list_of_models, "caretList") is not TRUE
I am stuck in two places. First, while setting up model1, I get the following error message:
The tuning parameter grid should have columns n.trees, interaction.depth, shrinkage, n.minobsinnode
Second, while setting up the greedy ensemble (and the linear ensemble), I get the caretList error shown above when combining the models. Any help would be appreciated.
PS: Apologies if these should have been separate questions.
Upvotes: 0
Views: 3749
Reputation: 21
It works if we convert all.models, which is a plain list, to a caretList by setting its class:
class(all.models) <- "caretList"
Thanks!
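For anyone wondering what that assignment actually does, here is a minimal base-R sketch. The list contents are hypothetical placeholders, not real caret fits. It only shows that setting the class relabels the list so a class-membership check passes. It does not validate that the elements are compatible train objects, so that remains your responsibility.

```r
# Stand-in for a plain list of fitted models (hypothetical placeholder values).
all.models <- list(gbm = "fit1", glm = "fit2")
class(all.models)                      # "list"

# Relabel the list; nothing about its contents is checked or changed.
class(all.models) <- "caretList"
inherits(all.models, "caretList")      # TRUE: class membership is what the failing check looks at
length(all.models)                     # still 2 elements
```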
Upvotes: 0
Reputation: 60
Answering the second part:
Newer versions of caret require all four gbm tuning parameters (n.trees, interaction.depth, shrinkage and n.minobsinnode) to be present in tuneGrid when method="gbm". So
tuneGrid=expand.grid(.n.trees=500, .interaction.depth=6, .shrinkage = 0.01,.n.minobsinnode = c(10))
would work just fine in the case above.
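A quick way to see which column was missing, as a sketch in base R only: the required names are copied from the error message in the question, and stripping the leading dot before comparing is just my own choice for this check.

```r
# Column names the error message says the gbm tuning grid should have.
required <- c("n.trees", "interaction.depth", "shrinkage", "n.minobsinnode")

# Grid from the question (three columns) and the grid from this answer (four).
old_grid <- expand.grid(.n.trees = 500, .interaction.depth = 15, .shrinkage = 0.01)
new_grid <- expand.grid(.n.trees = 500, .interaction.depth = 6,
                        .shrinkage = 0.01, .n.minobsinnode = c(10))

# Helper (hypothetical, for illustration): required names absent from a grid,
# ignoring a leading dot in the grid's column names.
missing_cols <- function(grid) setdiff(required, sub("^\\.", "", names(grid)))

missing_old <- missing_cols(old_grid)  # "n.minobsinnode"
missing_new <- missing_cols(new_grid)  # character(0)
print(missing_old)
print(missing_new)
```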
Upvotes: 1
Reputation: 581
You must use the function caretList() to create a list of caret models that can be passed to caretEnsemble, as seen in the vignette example below:
model_list <- caretList(
Class~., data=training,
trControl=my_control,
methodList=c("glm", "rpart")
)
Answering your second question, what do you expect the following line to do?
expand.grid(.n.trees=500, .interaction.depth=15, .shrinkage = 0.01)
You can check that it yields just a single combination of three values. You need at least two values in at least one of the columns to generate more than one parameter combination to tune over. Also, why do the names have an additional "." (dot) at the beginning?
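To make the point about combinations concrete, here is a small base-R sketch. The second grid uses made-up candidate values purely for illustration, not recommended settings.

```r
# The grid from the question: one value per column gives exactly one row.
single <- expand.grid(.n.trees = 500, .interaction.depth = 15, .shrinkage = 0.01)
nrow(single)   # 1

# Illustrative values only: two choices in two columns give 2 * 2 * 1 = 4 rows.
multi <- expand.grid(.n.trees = c(250, 500),
                     .interaction.depth = c(5, 15),
                     .shrinkage = 0.01)
nrow(multi)    # 4
```

With a single row there is nothing to compare, so the resampling only estimates the error for that one setting.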
Upvotes: 2