runningbirds

Reputation: 6615

Multiple scoring metrics with sklearn xgboost gridsearchcv

How do I run a grid search with sklearn and xgboost and get back multiple metrics, ideally reported at the F1-optimal threshold?

See my code below; I can't figure out what I'm doing wrong, and I don't understand the error.

######################### just making up a dataset here #########################
from sklearn import datasets

from sklearn.metrics import precision_score, recall_score, accuracy_score, roc_auc_score, make_scorer
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.model_selection import train_test_split
from sklearn.grid_search import RandomizedSearchCV

import xgboost as xgb

X, y = datasets.make_classification(n_samples=100000, n_features=20,
                                    n_informative=2, n_redundant=10,
                                    random_state=42)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.99,
                                                    random_state=42)

The rest is a bunch of parameters followed by a randomized grid search. If I set scoring to just 'roc_auc' it works; if I pass the scoring_evals dict, which seems to be the documented approach, I get an error. Where am I going wrong?

Additionally, how do I ensure that these metrics are reported at the F1-optimal threshold?

params = {
    'min_child_weight': [0.5, 1.0, 3.0, 5.0, 7.0, 10.0],
    'gamma': [0, 0.25, 0.5, 1.0],
    'reg_lambda': [0.1, 1.0, 5.0, 10.0, 50.0, 100.0],
    'max_depth': [2, 4, 6, 10],
    'learning_rate': [0.05, 0.1, 0.2, 0.3, 0.4],
    'colsample_bytree': [1, 0.8, 0.5],
    'subsample': [0.8],
    'n_estimators': [50]
}


folds = 5
max_models = 5

scoring_evals = {
    'AUC': 'roc_auc',
    'Accuracy': make_scorer(accuracy_score),
    'Precision': make_scorer(precision_score),
    'Recall': make_scorer(recall_score)
}


xgb_algo = xgb.XGBClassifier()
random_search = RandomizedSearchCV(xgb_algo,
                                   param_distributions=params, n_iter=max_models,
                                   scoring=scoring_evals, n_jobs=4, cv=5,
                                   verbose=False, random_state=2018)

random_search.fit(X_train, y_train)

The error is:

ValueError: scoring value should either be a callable, string or None. {'AUC': 'roc_auc', 'Accuracy': make_scorer(accuracy_score), 'Precision': make_scorer(precision_score), 'Recall': make_scorer(recall_score)} was passed

Upvotes: 1

Views: 3704

Answers (2)

Vivek Kumar

Reputation: 36599

First, check the version of scikit-learn you are using. If it's v0.19, then you are using the deprecated module.

You are doing this:

from sklearn.grid_search import RandomizedSearchCV

And you must have gotten a warning like:

DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. ... ... ...

The classes in the grid_search module are old and deprecated and don't contain the multi-metric functionality you are using.

Pay attention to that warning and do this:

from sklearn.model_selection import RandomizedSearchCV

...
...
...

random_search = RandomizedSearchCV(xgb_algo,
                                   param_distributions=params,
                                   n_iter=max_models,
                                   scoring=scoring_evals, n_jobs=4, cv=5,
                                   verbose=False, random_state=2018, refit=False)

Now look closely at the refit param. In the multi-metric setting you must set it explicitly, because the best hyper-parameters (and hence the final refitted model) can only be decided based on a single metric.

You can either set it to False, if you don't want a final model and only want the cross-validated performance of the different params, or set it to any one of the keys in your scoring dict.
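For example, here is a minimal sketch (reusing the question's params, scoring_evals, xgb_algo and training data, and assuming scikit-learn >= 0.19) that refits on the 'AUC' key and then reads the per-metric results out of cv_results_:

random_search = RandomizedSearchCV(xgb_algo,
                                   param_distributions=params,
                                   n_iter=max_models,
                                   scoring=scoring_evals, cv=5,
                                   refit='AUC',  # winning params are chosen by AUC
                                   random_state=2018)
random_search.fit(X_train, y_train)

# In the multi-metric setting, cv_results_ has one
# 'mean_test_<name>' column per key of the scoring dict
results = random_search.cv_results_
best = random_search.best_index_  # index of the AUC-best candidate
for name in scoring_evals:
    print(name, results['mean_test_%s' % name][best])

# best_params_ / best_estimator_ are available because refit was set
print(random_search.best_params_)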

Upvotes: 2

Mischa Lisovyi

Reputation: 3213

As the error suggests, and as the documentation of v0.18.2 states:

scoring : string, callable or None, default=None

one cannot provide multiple metrics in the scoring argument (in this scikit-learn version).

P.S. All the functions that you tried to wrap in make_scorer are already predefined as standard scorers, so you can use their string names (see the docs), as in the sketch below.
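For instance, the same scoring dict written with the predefined scorer names ('accuracy', 'precision' and 'recall' are standard scorer strings in scikit-learn):

# Same scoring dict, but using sklearn's predefined scorer names
# instead of wrapping each metric function in make_scorer
scoring_evals = {'AUC': 'roc_auc',
                 'Accuracy': 'accuracy',
                 'Precision': 'precision',
                 'Recall': 'recall'}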

EDITED: removed the comment on the usage of multiple metrics, following Vivek's criticism.

Upvotes: -1
