Iwan Thomas

Reputation: 313

Computing training score using cross_val_score

I am using cross_val_score to compute the mean score for a regressor. Here's a small snippet.

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

cross_val_score(LinearRegression(), X, y_reg, cv=5)

Using this I get an array of scores. I would like to know how the scores on the validation set (as returned in the array above) differ from those on the training set, to understand whether my model is over-fitting or under-fitting.

Is there a way of doing this with the cross_val_score function?

Upvotes: 7

Views: 4802

Answers (2)

FooBee

Reputation: 798

You can use cross_validate instead of cross_val_score. According to the docs:

The cross_validate function differs from cross_val_score in two ways:

  • It allows specifying multiple metrics for evaluation.
  • It returns a dict containing training scores, fit-times and score-times in addition to the test score.
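As a minimal sketch of how this could look for the question's regressor: in recent scikit-learn versions the training scores are only included when return_train_score=True is passed. The synthetic data from make_regression is an assumption standing in for the asker's X and y_reg, which are not shown.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

# Synthetic stand-in for the question's X and y_reg (assumption, for illustration only).
X, y_reg = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# return_train_score=True adds a "train_score" entry next to "test_score".
results = cross_validate(
    LinearRegression(), X, y_reg, cv=5, return_train_score=True
)

train_scores = results["train_score"]  # R^2 on each training split
test_scores = results["test_score"]    # R^2 on each held-out fold

print("train R^2:     ", train_scores)
print("validation R^2:", test_scores)
print("mean gap:      ", train_scores.mean() - test_scores.mean())
```

A training score far above the validation score hints at over-fitting, while low scores on both hint at under-fitting.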

Upvotes: 21

E.Z

Reputation: 1998

Why would you want that? cross_val_score with cv=5 already does that for you: it splits your training data into 5 folds and scores the model on each of the 5 held-out test subsets. This method already serves as a check against over-fitting.

Anyway, if you are eager to verify the score on your validation data yourself, then you first have to fit your LinearRegression on X and y_reg.
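A rough sketch of fitting and scoring by hand, fold by fold, might look like the following. The synthetic data is an assumption standing in for X and y_reg, and the unshuffled KFold is used here because it is what cv=5 resolves to for a regressor.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Synthetic stand-in for the question's X and y_reg (assumption, for illustration only).
X, y_reg = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

train_r2, val_r2 = [], []
for train_idx, val_idx in KFold(n_splits=5).split(X):
    # Fit on the training part of this split only.
    model = LinearRegression().fit(X[train_idx], y_reg[train_idx])
    # score() returns R^2 for regressors.
    train_r2.append(model.score(X[train_idx], y_reg[train_idx]))
    val_r2.append(model.score(X[val_idx], y_reg[val_idx]))

print("train R^2:     ", train_r2)
print("validation R^2:", val_r2)
```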

Upvotes: -5
