Soyol

Reputation: 793

GridSearchCV with Scoring Function and Refit Parameter

My question seems to be similar to this one but there is no solid answer there.

I'm doing multi-class, multi-label classification, and I have defined my own scorers for it. However, to use the refit parameter and get the model's best parameters at the end, I need to designate one of the scorer functions for refit. When I do so, I get the error: missing 1 required positional argument: 'y_pred'. y_pred should be the outcome of fit, but I'm not sure where this issue comes from or how to solve it.

Below is the code:

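import xgboost as xgb
from sklearn.metrics import make_scorer, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputClassifier

# Custom scorers, keyed by the names used in cv_results_
scoring = {'roc_auc_score': make_scorer(roc_auc_score),
           'precision_score': make_scorer(precision_score, average='samples'),
           'recall_score': make_scorer(recall_score, average='samples')}

# The estimator__ prefix targets the XGBClassifier wrapped by MultiOutputClassifier
params = {'estimator__n_estimators': [500, 800],
          'estimator__max_depth': [10, 50]}

model = xgb.XGBClassifier(n_jobs=4)
model = MultiOutputClassifier(model)

# Passing a scorer object as refit is what triggers the error
cls = GridSearchCV(model, params, cv=3, refit=make_scorer(roc_auc_score), scoring=scoring, verbose=3, n_jobs=-1)

model = cls.fit(x_train_ups, y_train_ups)
print(model.best_params_)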

Upvotes: 1

Views: 2756

Answers (1)

Ben Reiniger

Reputation: 12582

You should use refit="roc_auc_score", the name of the scorer in your dictionary. From the docs:

For multiple metric evaluation, this needs to be a str denoting the scorer that would be used to find the best parameters for refitting the estimator at the end.
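As a minimal sketch of the string form, the snippet below runs a small multi-metric grid search with refit set to a key of the scoring dictionary. The synthetic data from make_multilabel_classification and the DecisionTreeClassifier are assumptions standing in for x_train_ups, y_train_ups and the XGBoost model, chosen only so the example runs quickly without extra dependencies.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.metrics import make_scorer, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic multi-label data (assumption: stands in for x_train_ups / y_train_ups).
X, y = make_multilabel_classification(
    n_samples=300, n_features=10, n_classes=3, random_state=0
)

scoring = {
    "roc_auc_score": make_scorer(roc_auc_score),
    "precision_score": make_scorer(precision_score, average="samples"),
    "recall_score": make_scorer(recall_score, average="samples"),
}
params = {"estimator__max_depth": [2, 5]}

# A decision tree replaces XGBClassifier here purely to keep the sketch light.
model = MultiOutputClassifier(DecisionTreeClassifier(random_state=0))

# refit is the *name* of one scorer in the scoring dict, not a scorer object.
search = GridSearchCV(model, params, cv=3, scoring=scoring, refit="roc_auc_score")
search.fit(X, y)

print(search.best_params_)
```

With a string refit, best_params_, best_score_ and best_estimator_ all refer to the named metric, and the other metrics remain available in cv_results_ under keys such as mean_test_precision_score.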

Using a callable for refit has a different purpose: the callable should take the cv_results_ dict and return the best_index_. That explains the error message: sklearn is trying to pass cv_results_ to your auc scorer function, but that function should take parameters y_true and y_pred.
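For comparison, here is a rough sketch of what a callable refit is meant to look like. The function name pick_best_index is hypothetical, and the synthetic data and decision tree are again assumptions used only to keep the example self-contained. The function receives the cv_results_ dict and returns the integer index of the parameter setting to refit on.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.metrics import make_scorer, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic multi-label data (assumption: stands in for the real training set).
X, y = make_multilabel_classification(
    n_samples=300, n_features=10, n_classes=3, random_state=0
)

scoring = {
    "roc_auc_score": make_scorer(roc_auc_score),
    "recall_score": make_scorer(recall_score, average="samples"),
}
params = {"estimator__max_depth": [2, 5]}
model = MultiOutputClassifier(DecisionTreeClassifier(random_state=0))


def pick_best_index(cv_results):
    """Hypothetical selector: takes cv_results_ and returns best_index_."""
    # Choose the candidate with the highest mean test AUC across folds.
    return int(np.argmax(cv_results["mean_test_roc_auc_score"]))


search = GridSearchCV(model, params, cv=3, scoring=scoring, refit=pick_best_index)
search.fit(X, y)

print(search.best_index_, search.best_params_)
```

This form is mainly useful when the selection rule is more than "highest score on one metric", for example trading a little AUC for a simpler model. Note that best_score_ is not set when refit is a callable.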

Upvotes: 5
