Reputation: 351
I'm trying to tune the parameters for a GLM boost model. According to the caret package documentation for this model, there are two parameters that can be adjusted: mstop and prune.
library(caret)
library(mlbench)
data(Sonar)
set.seed(25)
trainIndex = createDataPartition(Sonar$Class, p = 0.9, list = FALSE)
training = Sonar[ trainIndex,]
testing = Sonar[-trainIndex,]
### set training parameters
fitControl = trainControl(method = "repeatedcv",
                          number = 10,
                          repeats = 10,
                          ## Estimate class probabilities
                          classProbs = TRUE,
                          ## Evaluate two-class performance
                          ## (ROC, sensitivity, specificity) using the following function
                          summaryFunction = twoClassSummary)
### train the models
set.seed(69)
# Use the expand.grid to specify the search space
glmBoostGrid = expand.grid(mstop = c(50, 100, 150, 200, 250, 300),
                           prune = c('yes', 'no'))
glmBoostFit = train(Class ~ .,
                    data = training,
                    method = "glmboost",
                    trControl = fitControl,
                    tuneGrid = glmBoostGrid,
                    metric = 'ROC')
glmBoostFit
The output is the following:
Boosted Generalized Linear Model
188 samples
60 predictors
2 classes: 'M', 'R'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 169, 169, 169, 169, 170, 169, ...
Resampling results across tuning parameters:
mstop ROC Sens Spec ROC SD Sens SD Spec SD
50 0.8261806 0.764 0.7598611 0.10208114 0.1311104 0.1539477
100 0.8265972 0.729 0.7625000 0.09459835 0.1391250 0.1385465
150 0.8282083 0.717 0.7726389 0.09570417 0.1418152 0.1382405
200 0.8307917 0.714 0.7769444 0.09484042 0.1439011 0.1452857
250 0.8306667 0.719 0.7756944 0.09452604 0.1436740 0.1535578
300 0.8278403 0.728 0.7722222 0.09794868 0.1425398 0.1576030
Tuning parameter 'prune' was held constant at a value of yes
ROC was used to select the optimal model using the largest value.
The final values used for the model were mstop = 200 and prune = yes.
The prune parameter is held constant ("Tuning parameter 'prune' was held constant at a value of yes") even though glmBoostGrid also contains prune == 'no'. I took a look at the boost_control method in the mboost package documentation, and only the mstop parameter is accessible there. So how can the prune parameter be tuned through the tuneGrid argument of the train method?
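For reference, caret ships a model definition for "glmboost" that can be inspected directly to see where prune enters the fit. A minimal sketch, assuming caret's getModelInfo() helper (the variable name glmboostInfo is mine):

library(caret)
# Pull caret's built-in model definition for method = "glmboost"
glmboostInfo = getModelInfo("glmboost", regex = FALSE)[[1]]
glmboostInfo$parameters  # should list both mstop and prune as tunable
glmboostInfo$fit         # the fit function, where prune is handled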
Upvotes: 4
Views: 3459
Reputation: 23608
The difference is located in this part of caret's fitting code for glmboost:
if (param$prune == "yes") {
    out <- if (is.factor(y))
        out[mstop(AIC(out, "classical"))]
    else out[mstop(AIC(out))]
}
The difference lies in how the AIC is calculated. But after running various tests with glmboost in caret, I have my doubts whether it is behaving as expected. I have created an issue on GitHub to see if my suspicions are correct. I'll edit my answer if there is more information from the developers.
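To see what the prune == "yes" branch actually does, it can be reproduced outside of caret with mboost directly. A minimal sketch, assuming the training data frame from the question (the names mod and aic are mine):

library(mboost)
# Fit the boosted GLM directly; a two-level factor response needs family = Binomial()
mod <- glmboost(Class ~ ., data = training, family = Binomial(),
                control = boost_control(mstop = 200))
# "classical" AIC is what caret applies for factor responses (see the snippet above)
aic <- AIC(mod, "classical")
mstop(aic)       # the AIC-optimal number of boosting iterations
mod[mstop(aic)]  # subsetting prunes the model back to that iteration

Note that mboost's subset method changes the model object in place rather than returning an independent copy, which is easy to trip over when comparing pruned and unpruned fits of the same model.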
Upvotes: 3