Reputation: 51
I am trying to plot a learning curve to figure out whether my model is suffering from high bias. To do this, I need to plot the training set error against the cross-validation set error. Is there a way to get this information in scikit-learn?
rscv_rfc = grid_search.RandomizedSearchCV(RandomForestClassifier(), param_grid, n_jobs=4, cv=10)
rscv_rfc
gives me the best estimator etc., along with the best params for the model. Is there a way to get the mean CV error from this object?
Upvotes: 0
Views: 140
Reputation: 14377
The docstring of RandomizedSearchCV tells us that it exposes grid_scores_, containing all the scores it evaluated. However, these are all scores evaluated on held-out data from splits of the training set.
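To make the "mean CV error" part concrete, here is a minimal sketch. It assumes a more recent scikit-learn than the one in the question, where the search classes live in sklearn.model_selection and the per-candidate scores are read from cv_results_ rather than grid_scores_. The synthetic data and the small parameter space are made up for illustration, and "error" is taken to be 1 minus the default accuracy score.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data and a hypothetical parameter space, for illustration only.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
param_dist = {"n_estimators": [10, 20, 30], "max_depth": [2, 4, None]}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_dist,
    n_iter=4,
    cv=5,
    random_state=0,
)
search.fit(X, y)

# One mean held-out score per sampled candidate (accuracy by default).
mean_cv_scores = search.cv_results_["mean_test_score"]
mean_cv_errors = 1.0 - mean_cv_scores

# Mean CV error of the best candidate.
best_cv_error = 1.0 - search.best_score_
```

These are still held-out scores only, so on their own they give just one of the two curves asked for.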
Here is the place where the scores are actually evaluated. While the function _fit_and_score has an option return_train_score, which you could set if you constructed your own grid search object, it is set to False here, and thus the training scores remain inaccessible.
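If the goal is the learning curve itself rather than the search, one way around this is to skip the search object. The sketch below uses sklearn.model_selection.learning_curve, which returns both training and validation scores. It assumes a scikit-learn version that ships the model_selection module. The data, the estimator settings and the training-set sizes are placeholders, and "error" is again 1 minus accuracy.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve

# Synthetic data for illustration only.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

train_sizes, train_scores, cv_scores = learning_curve(
    RandomForestClassifier(n_estimators=20, random_state=0),
    X,
    y,
    train_sizes=np.linspace(0.2, 1.0, 5),
    cv=5,
)

# Each row is one training-set size, each column one CV fold.
train_error = 1.0 - train_scores.mean(axis=1)
cv_error = 1.0 - cv_scores.mean(axis=1)
```

Plotting train_error and cv_error against train_sizes gives the usual bias/variance picture: two curves that converge at a high error suggest high bias, while a persistent gap suggests high variance.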
I wonder whether it would be useful in general to propagate this option through to the *SearchCV objects.
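For what it is worth, later scikit-learn releases do accept a return_train_score argument on the search classes in sklearn.model_selection. A minimal sketch, assuming such a version is installed (the data and parameter space are again made up for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data and a hypothetical parameter space, for illustration only.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
param_dist = {"n_estimators": [10, 20, 30], "max_depth": [2, 4, None]}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_dist,
    n_iter=4,
    cv=5,
    random_state=0,
    return_train_score=True,  # also record scores on the training folds
)
search.fit(X, y)

# Per-candidate mean errors on the training folds and on the held-out folds.
mean_train_error = 1.0 - search.cv_results_["mean_train_score"]
mean_cv_error = 1.0 - search.cv_results_["mean_test_score"]
```

Comparing the two arrays candidate by candidate shows how far each parameter setting's training error sits below its cross-validation error.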
Upvotes: 1