pankaj

Reputation: 460

How to print Recall and Accuracy along with Parameters used in a GridSearch in Sklearn?

I want to print the accuracy and recall alongside each parameter combination tried in the grid search. How can that be done?

My GridSearchCV code:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, precision_score, recall_score
# sklearn.grid_search was removed; GridSearchCV now lives in model_selection
from sklearn.model_selection import GridSearchCV

rf1 = RandomForestClassifier(n_jobs=-1, max_features='sqrt')

# Use a grid over parameters of interest
param_grid = {
    "n_estimators": [50, 100, 150, 200],
    "max_depth": [2, 5, 10],
    "min_samples_leaf": [10, 20, 30],
}

# Each key in the scoring dict becomes a mean_test_<key> entry in cv_results_
scoring = {'precision': make_scorer(precision_score), 'Recall': make_scorer(recall_score)}
# With several metrics, refit must name one scorer key (or be False)
CV_rfc = GridSearchCV(estimator=rf1, param_grid=param_grid, cv=10, scoring=scoring, refit='Recall')
CV_rfc.fit(X_train_res, y_train_res)

My Expected Output

{'max_depth': 10, 'min_samples_leaf': 2, 'n_estimators': 50,'accuracy':.97,'recall':.89}
{'max_depth': 5, 'min_samples_leaf':10 , 'n_estimators': 100,'accuracy':.98,'recall':.92}

Upvotes: 4

Views: 2111

Answers (1)

KPLauritzen

Reputation: 1869

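If you pass scoring as a list (or dict) of scorers, CV_rfc.cv_results_ contains the mean cross-validated score for each scorer. With more than one scorer you also have to set refit, either to False or to the name of the scorer used to pick the best model.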

For example:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification()
base_clf = RandomForestClassifier()
param_grid = {"n_estimators": [50, 100, 150, 200]}
# return_train_score=True is needed in recent scikit-learn versions
# to get the mean_train_* entries shown in the output
CV_rf = GridSearchCV(base_clf, param_grid, scoring=['accuracy', 'roc_auc'],
                     refit=False, return_train_score=True)
CV_rf.fit(X, y)

print(CV_rf.cv_results_)

and you get output like:

{'mean_fit_time': array([ 0.05867839,  0.10268728,  0.15536443,  0.19937317]),
 'mean_score_time': array([ 0.00600123,  0.01033529,  0.0146695 ,  0.02000403]),
 'mean_test_accuracy': array([ 0.9 ,  0.91,  0.89,  0.91]),
 'mean_test_roc_auc': array([ 0.91889706,  0.94610294,  0.94253676,  0.94308824]),
 'mean_train_accuracy': array([ 1.,  1.,  1.,  1.]),
 'mean_train_roc_auc': array([ 1.,  1.,  1.,  1.]),
 [...]
 }

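So the mean_test_<scorer> entries (mean_test_accuracy, mean_test_roc_auc) are what you are after; the parameters for each entry are in cv_results_['params'], in the same order. Note that you can load cv_results_ into a pandas DataFrame with pd.DataFrame(CV_rf.cv_results_). That helps readability a lot!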
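
To get something close to the output format asked for in the question, one option is to pair each entry of cv_results_['params'] with its mean test scores. The following is only a minimal sketch: the synthetic data, the small grid, cv=3, and the variable names search, results, and rows are assumptions made for illustration, and the built-in scorer names 'accuracy' and 'recall' are used in place of the make_scorer dict from the question.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data stands in for X_train_res / y_train_res from the question.
X, y = make_classification(n_samples=200, random_state=0)

# Deliberately small grid so the sketch runs quickly.
param_grid = {"n_estimators": [10, 20], "max_depth": [2, 5]}

# refit=False: we only want the scores, not a refitted best model.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring=["accuracy", "recall"],
    cv=3,
    refit=False,
)
search.fit(X, y)

results = pd.DataFrame(search.cv_results_)

# One dict per parameter combination, with the mean CV scores merged in.
rows = [
    {**params, "accuracy": acc, "recall": rec}
    for params, acc, rec in zip(
        results["params"],
        results["mean_test_accuracy"],
        results["mean_test_recall"],
    )
]
for row in rows:
    print(row)
```

Each printed line is a dict with the grid parameters plus 'accuracy' and 'recall', one line per combination. If a tabular view is enough, selecting the columns results[["params", "mean_test_accuracy", "mean_test_recall"]] gives the same information without building the dicts.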

Upvotes: 1
