Reputation: 996
I try to optimize the averaged prediction of two logistic regressions in a classification task using a superlearner.
My measure of interest is classif.auc
The mlr3
help file tells me (?mlr_learners_avg
)
Predictions are averaged using weights (in order of appearance in the data) which are optimized using nonlinear optimization from the package "nloptr" for a measure provided in measure (defaults to classif.acc for LearnerClassifAvg and regr.mse for LearnerRegrAvg). Learned weights can be obtained from $model. Using non-linear optimization is implemented in the SuperLearner R package. For a more detailed analysis the reader is referred to LeDell (2015).
I have two questions regarding this information:
When I look at the source code I think LearnerClassifAvg$new()
defaults to "classif.ce"
, is that true?
I think I could set it to classif.auc
with param_set$values <- list(measure="classif.auc",optimizer="nloptr",log_level="warn")
The help file refers to the SuperLearner
package and LeDell 2015. As I understand it correctly, the proposed "AUC-Maximizing Ensembles through Metalearning" solution from the paper above is, however, not impelemented in mlr3
? Or do I miss something? Could this solution be applied in mlr3
? In the mlr3
book I found a paragraph regarding calling an external optimization function, would that be possible for SuperLearner
?
Upvotes: 0
Views: 248
Reputation: 321
As far as I understand it, LeDell2015 proposes and evaluate a general strategy that optimizes AUC as a black-box function by learning optimal weights. They do not really propose a best strategy or any concrete defaults so I looked into the defaults of the SuperLearner package's AUC optimization strategy.
Assuming I understood the paper correctly:
The LearnerClassifAvg
basically implements what is proposed in LeDell2015 namely, it optimizes the weights for any metric using non-linear optimization. LeDell2015 focus on the special case of optimizing AUC. As you rightly pointed out, by setting the measure to "classif.auc"
you get a meta-learner that optimizes AUC. The default with respect to which optimization routine is used deviates between mlr3pipelines and the SuperLearner package, where we use NLOPT_LN_COBYLA
and SuperLearner ... uses the Nelder-Mead method via the optim
function to minimize rank loss (from the documentation).
So in order to get exactly the same behaviour, you would need to implement a Nelder-Mead
bbotk::Optimizer
similar to here that simply wraps stats::optim
with method Nelder-Mead
and carefully compare settings and stopping criteria. I am fairly confident that NLOPT_LN_COBYLA
delivers somewhat comparable results, LeDell2015 has a comparison of the different optimizers for further reference.
Thanks for spotting the error in the documentation. I agree, that the description is a little unclear and I will try to improve this!
Upvotes: 1