kitarp

Reputation: 51

How does one find out the cv error from the sklearn package?

I am trying to plot a learning curve to figure out whether my model is suffering from high bias. To do this I need to plot the training set errors against the cross-validation set errors. Is there a way to get this information in scikit-learn?

rscv_rfc = grid_search.RandomizedSearchCV(RandomForestClassifier(), param_grid, n_jobs=4, cv=10)

rscv_rfc gives me the best estimator etc., along with the best params for the model. Is there a way to get the mean cv error from this object?

Upvotes: 0

Views: 140

Answers (1)

eickenberg

Reputation: 14377

The docstring of RandomizedSearchCV tells us that it exposes grid_scores_, which contains all the scores it evaluated. However, these are all scores evaluated on held-out data from splits of the training set, i.e. cross-validation scores and not training scores.
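
To turn those scores into a mean cv error, something along these lines might work. This is only a sketch: it mocks the entries with a namedtuple whose field names (parameters, mean_validation_score, cv_validation_scores) are my assumption about what grid_scores_ holds, so check them against your installed version. The numbers are made up, and error is taken as 1 - score, which only makes sense for an accuracy-like scorer.

```python
from collections import namedtuple

# Stand-in for the entries of grid_scores_ (assumed field names, verify locally).
CVScore = namedtuple(
    "CVScore", ["parameters", "mean_validation_score", "cv_validation_scores"]
)

# Made-up scores for illustration only. With a fitted search you would
# pass rscv_rfc.grid_scores_ to the helpers below instead.
mock_grid_scores = [
    CVScore({"n_estimators": 10}, 0.80, [0.78, 0.82, 0.80]),
    CVScore({"n_estimators": 50}, 0.90, [0.88, 0.92, 0.90]),
]


def mean_cv_errors(grid_scores):
    """Return (parameters, mean cv error) pairs, with error = 1 - score."""
    return [(s.parameters, 1.0 - s.mean_validation_score) for s in grid_scores]


def best_entry(grid_scores):
    """Return the entry with the highest mean held-out score."""
    return max(grid_scores, key=lambda s: s.mean_validation_score)


cv_errors = mean_cv_errors(mock_grid_scores)
best = best_entry(mock_grid_scores)
```

Note that this still only gives the held-out side of the learning curve.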

The scores are actually evaluated inside the search object's fitting code, through the function _fit_and_score. That function does have an option return_train_score, which you could set if you constructed your own grid search object, but it is set to False there, so the training scores remain inaccessible.
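
To illustrate what returning training scores per fold would amount to, here is a toy version written without scikit-learn. Everything in it is hypothetical: the 1-D nearest-centroid classifier, the helper names and the data are made up, and it is not how _fit_and_score is implemented. It only shows the idea of scoring each fold on both its training part and its held-out part.

```python
def fit_centroids(X, y):
    """Toy 1-D nearest-centroid classifier: store the mean feature per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def error_rate(centroids, X, y):
    """Fraction of samples that are misclassified."""
    wrong = sum(predict(centroids, x) != label for x, label in zip(X, y))
    return wrong / len(X)


def kfold_errors(X, y, k):
    """Return (mean training error, mean validation error) over k folds."""
    train_errs, val_errs = [], []
    for fold in range(k):
        # Sample i goes to the held-out part of fold (i % k).
        val_idx = [i for i in range(len(X)) if i % k == fold]
        train_idx = [i for i in range(len(X)) if i % k != fold]
        X_tr, y_tr = [X[i] for i in train_idx], [y[i] for i in train_idx]
        X_val, y_val = [X[i] for i in val_idx], [y[i] for i in val_idx]
        model = fit_centroids(X_tr, y_tr)
        train_errs.append(error_rate(model, X_tr, y_tr))  # the "train score" side
        val_errs.append(error_rate(model, X_val, y_val))  # the held-out side
    return sum(train_errs) / k, sum(val_errs) / k


# Made-up, cleanly separable data, ordered so both folds contain both classes.
X = [0.0, 1.0, 0.2, 0.9, 1.1, 0.1, 0.8, 0.3]
y = [0, 1, 0, 1, 1, 0, 1, 0]
train_err, val_err = kfold_errors(X, y, k=2)
```

Repeating this for increasing training set sizes and plotting the two errors against each other would give the learning curve from the question.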

I am wondering whether it would be useful in general to have this option propagate through into the *SearchCV objects or not.

Upvotes: 1
