I have a dataset like iris, and my y is a multi-class factor variable. Is there any way to see the same results for method = rf, method = treebag, and method = boost? Many thanks in advance.
library(caret)
data(iris); head(iris)
iris$Species <- factor(iris$Species)
set.seed(87)
# list = FALSE returns the row indices as a matrix instead of a list
inTrainingSet <- createDataPartition(iris$Species, p = .80, list = FALSE)
train <- iris[inTrainingSet, ]
test <- iris[-inTrainingSet, ]
ctrl <- trainControl(method = "cv", number = 2, verboseIter = TRUE)
pls <- train(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
             method = "pls", data = iris,
             trControl = ctrl)
attributes(varImp(pls))
varImp(pls)$importance
There are a few points to your question. If a model has a built-in method for estimating variable importance, varImp uses it by default (useModel = TRUE).
For random forest, add importance = TRUE while fitting:
rf <- train(Species ~ Sepal.Length+Sepal.Width+Petal.Length+Petal.Width ,
method = "rf", data = iris,
trControl = ctrl,importance=TRUE)
varImp(rf)
rf variable importance

  variables are sorted by maximum importance across the classes
             setosa versicolor virginica
Petal.Length  66.94     100.00     85.40
Petal.Width   63.86      92.22     89.87
Sepal.Length  16.75      24.05     24.90
Sepal.Width   12.75       0.00     17.49
If the model has no built-in importance measure for the multi-class case, you can set useModel = FALSE, and pairwise ROC curves are used to derive per-class importances instead. See the caret documentation on variable importance for the specifics:
tb <- train(Species ~ Sepal.Length+Sepal.Width+Petal.Length+Petal.Width ,
method = "treebag", data = iris,
trControl = ctrl,importance=TRUE)
varImp(tb,useModel=TRUE)
treebag variable importance

             Overall
Petal.Length  100.00
Petal.Width    99.17
Sepal.Length   32.23
Sepal.Width     0.00
varImp(tb,useModel=FALSE)
ROC curve variable importance

  variables are sorted by maximum importance across the classes
             setosa versicolor virginica
Petal.Width  100.00     100.00     100.0
Petal.Length 100.00     100.00     100.0
Sepal.Length  90.70      59.30      90.7
Sepal.Width   54.59      54.59       0.0
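For reference, the ROC-based numbers do not depend on the fitted model at all, and as far as I know they come from caret's filterVarImp, which you can also call directly on the predictors and the outcome. A minimal sketch, assuming caret and pROC are installed; roc_imp is just an illustrative name, and the values are raw AUCs rather than rescaled to 0-100 the way varImp does by default:

```r
library(caret)

data(iris)

# Filter-based importance: for a factor outcome with more than two classes,
# the AUC is computed for each pair of classes and, per class, the maximum
# over the relevant pairs is reported.
roc_imp <- filterVarImp(x = iris[, 1:4], y = iris$Species)

# One row per predictor, one column per class
roc_imp
```

Because this only looks at one predictor at a time, it says nothing about how the fitted model actually uses the variables, so treat it as a complement to the model-based importance rather than a replacement.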
You did not specify which boosted tree method to use, but I guess you can easily apply one of the options above to it.
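As a rough sketch of what that could look like, here is the same pattern with method = "gbm". This is an assumption on my part, since the question does not name a boosting method; it needs the gbm package installed, gbmFit is an illustrative name, and recent gbm releases may emit warnings for multi-class outcomes:

```r
library(caret)

data(iris)
set.seed(87)
ctrl <- trainControl(method = "cv", number = 2)

# verbose = FALSE is passed through to gbm to silence its iteration log
gbmFit <- train(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                method = "gbm", data = iris,
                trControl = ctrl, verbose = FALSE)

# Model-based importance from the boosted trees
varImp(gbmFit)

# Per-class importance from pairwise ROC curves, independent of the model
varImp(gbmFit, useModel = FALSE)
```

The second call should give you one column per class, like the treebag output above.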