Understand SciKit Learn CV Validation Scores

Question

I'm trying to understand the cv_validation_scores output I get when running GridSearchCV. The documentation does not adequately explain this.

When I print grid_search.grid_scores_, I get a list with items, like this:

[mean: 0.60000, std: 0.18002, params: {'tfidf__binary': True, 'tfidf__ngram_range': (1, 1)....

which makes sense. However, when I try to unpack each entry of grid_scores_, I get:

[0] same dictionary as above, makes sense
[1] mean score over all folds, makes sense
[2] a list that I don't understand, that looks like, "[ 0.75        0.33333333  0.66666667]"

What are the scores being reported here?

Andreas Mueller · Accepted Answer

As I posted on the mailing list, the documentation makes this quite clear:

grid_scores_ : list of named tuples

Contains scores for all parameter combinations in param_grid. Each entry corresponds to one parameter setting. Each named tuple has the attributes:
        parameters, a dict of parameter settings
        mean_validation_score, the mean score over the cross-validation folds
        cv_validation_scores, the list of scores for each fold

So the third element, cv_validation_scores, holds the scores per fold in the cross-validation, one value for each fold.
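
As a rough illustration of how the three fields relate, here is a minimal sketch that uses only the standard library. The GridScore tuple is a hypothetical stand-in for one grid_scores_ entry (its field names follow the documentation quoted above), the numbers are the ones from the question, and the fold sizes of 4, 3 and 3 test samples are purely an assumption, chosen because they would reproduce the reported values. It also assumes the array in the question belongs to the first entry.

```python
from collections import namedtuple
from statistics import mean, pstdev

# Hypothetical stand-in for one grid_scores_ entry; not the scikit-learn class itself.
GridScore = namedtuple(
    "GridScore",
    ["parameters", "mean_validation_score", "cv_validation_scores"],
)

entry = GridScore(
    # Only the parameters visible in the question; the rest are elided there.
    parameters={"tfidf__binary": True, "tfidf__ngram_range": (1, 1)},
    mean_validation_score=0.6,
    cv_validation_scores=[0.75, 1 / 3, 2 / 3],
)

# Positional unpacking gives the [0], [1], [2] items from the question.
params, mean_score, fold_scores = entry

# The population standard deviation of the per-fold scores.
fold_std = pstdev(fold_scores)
print(round(fold_std, 5))  # 0.18002, the std shown in the printed entry

# The plain average of the folds is about 0.583, not 0.6.
plain_mean = mean(fold_scores)

# Assumption: folds of 4, 3 and 3 test samples. Weighting each fold score
# by its size would then give the reported mean.
fold_sizes = [4, 3, 3]
weighted_mean = sum(s * n for s, n in zip(fold_scores, fold_sizes)) / sum(fold_sizes)
print(round(weighted_mean, 5))  # 0.6
```

Under these assumptions, a mean that differs slightly from the plain average of the fold scores would be explained by weighting each fold by its number of test samples; with folds of equal size the two would coincide.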
