Reputation: 113
I used GridSearchCV to run cross-validation across k folds and tune my hyperparameters. In the results attribute cv_results_, the mean scores, which should be the mean over the individual folds, are wrong. Here is my code:
gscv = GridSearchCV(n_jobs=n_jobs, cv=train_test_iterable, estimator=pipeline,
                    param_grid=param_grid, verbose=10,
                    scoring=['accuracy', 'precision', 'recall', 'f1'],
                    refit='f1', return_train_score=return_train_score,
                    error_score=error_score)
gscv.fit(X, Y)
gscv.cv_results_
The cv_results_ attribute contains the following (displayed as a table):
mean_test_f1    split0_test_f1    split1_test_f1    Actual Mean
0.934310796     0.935603198       0.933665455       0.934634326
0.931279716     0.908430118       0.942689316       0.925559717
0.927683609     0.912005672       0.935512149       0.923758911
0.680908006     0.741198823       0.650802701       0.696000762
0.680908006     0.741198823       0.650802701       0.696000762
0.646005028     0.684483208       0.626791532       0.65563737
0.840273248     0.847484083       0.836672627       0.842078355
0.837160828     0.847484083       0.832006068       0.839745075
0.833637        0.842109375       0.829406448       0.835757911
As you can see above, mean_test_f1 is not the mean of the two folds split0_test_f1 and split1_test_f1; the actual mean is shown in the last column.
Note: f1 refers to the F1-score.
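For reference, the "Actual Mean" column is simply the unweighted average of the two split columns. A minimal sketch of that check, using the fitted gscv from above:

import numpy as np

res = gscv.cv_results_
# Plain, unweighted mean over the per-fold test scores (the "Actual Mean" column):
actual_mean = np.mean([res['split0_test_f1'], res['split1_test_f1']], axis=0)
# Non-zero differences show the reported mean is computed differently:
print(actual_mean - res['mean_test_f1'])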
Did anyone face similar issues?
Upvotes: 2
Views: 266
Reputation: 36619
Try setting iid=False in GridSearchCV(...) and compare.
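For example, a minimal sketch reusing the variable names from the question:

gscv = GridSearchCV(n_jobs=n_jobs, cv=train_test_iterable, estimator=pipeline,
                    param_grid=param_grid, verbose=10,
                    scoring=['accuracy', 'precision', 'recall', 'f1'],
                    refit='f1', return_train_score=return_train_score,
                    error_score=error_score,
                    iid=False)  # report the plain mean across folds
gscv.fit(X, Y)

(For what it's worth, iid was deprecated in scikit-learn 0.22 and removed in 0.24; recent versions always report the unweighted mean.)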
According to the documentation:

iid : boolean, default=True
    If True, the data is assumed to be identically distributed across the folds, and the loss minimized is the total loss per sample, and not the mean loss across the folds.
So when iid is True (the default), the averaging of test scores includes a weight, as specified here in the source code:
_store('test_%s' % scorer_name, test_scores[scorer_name],
splits=True, rank=True,
weights=test_sample_counts if iid else None)
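To see the effect numerically, here is a minimal sketch using the first row of the table in the question; the 1:2 weight ratio is a hypothetical fold-size ratio, chosen because it approximately reproduces the reported mean_test_f1:

import numpy as np

split0, split1 = 0.935603198, 0.933665455  # first row of the table

# Unweighted mean: the "Actual Mean" the question expected.
print(np.mean([split0, split1]))                     # 0.934634...

# iid=True weights each fold by its test-set size; with a
# hypothetical 1:2 split of the test samples, the reported
# mean_test_f1 appears:
print(np.average([split0, split1], weights=[1, 2]))  # 0.934311...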
Please note that train scores are not affected by this weighting, so also cross-check the mean of the train scores.
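A quick way to verify that (assuming return_train_score=True so the train-score keys are present in cv_results_):

import numpy as np

res = gscv.cv_results_
train_mean = np.mean([res['split0_train_f1'], res['split1_train_f1']], axis=0)
# Train scores are never weighted by fold size, so this should hold:
print(np.allclose(train_mean, res['mean_train_f1']))  # expected: True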
Upvotes: 1