Reputation: 109
I am trying to use 'aucpr' as the evaluation metric for early stopping with scikit-learn's RandomizedSearchCV and XGBoost, but I am unable to specify maximize=True
in the early-stopping fit params. Instead, early stopping minimizes the eval_metric, AUCPR.
I have already referred to this question: GridSearchCV - XGBoost - Early Stopping
But it seems early stopping works only for minimization objectives? The best iteration chosen by early stopping is the one where AUCPR is lowest, which is the wrong direction of optimization.
xgb = XGBClassifier()

params = {
    'min_child_weight': [0.1, 1, 5, 10, 50],
    'gamma': [0.5, 1, 1.5, 2, 5],
    'subsample': [0.6, 0.8, 1.0],
    'colsample_bytree': [0.6, 0.8, 1.0],
    'max_depth': [5, 10, 25, 50],
    'learning_rate': [0.0001, 0.001, 0.1, 1],
    'n_estimators': [50, 100, 250, 500],
    'reg_alpha': [0.0001, 0.001, 0.1, 1],
    'reg_lambda': [0.0001, 0.001, 0.1, 1]
}

fit_params = {
    "early_stopping_rounds": 5,
    "eval_metric": "aucpr",
    "eval_set": [(X_val, y_val)]
}
random_search = RandomizedSearchCV(
    xgb,
    cv=folds,
    param_distributions=params,
    n_iter=param_comb,
    scoring=make_scorer(auc_precision_recall_curve, needs_proba=True),
    n_jobs=10,
    verbose=10,
    random_state=1001,
)

random_search.fit(X_train, y_train, **fit_params)
Upvotes: 5
Views: 5662
Reputation: 109
It seems that maximizing AUCPR for early stopping does not work through the sklearn wrapper:
https://github.com/dmlc/xgboost/issues/3712
Upvotes: 1