Ryo

Reputation: 167

Python - k fold cross validation for linear_model.Lasso

I have the following code using linear_model.Lasso:

X_train, X_test, y_train, y_test = cross_validation.train_test_split(X,y,test_size=0.2)
clf = linear_model.Lasso()
clf.fit(X_train,y_train)
accuracy = clf.score(X_test,y_test)
print(accuracy)

I want to perform k-fold cross-validation (10 folds, to be specific). What would be the right code to do that?

Upvotes: 4

Views: 12272

Answers (2)

Espoir Murhabazi

Reputation: 6376

Here is the code I use to cross-validate a linear regression model and get the per-fold details:

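import numpy as np
from sklearn.model_selection import cross_val_score

# clf, X_train and y_train are the objects defined in the question
scores = cross_val_score(clf, X_train, y_train, scoring="neg_mean_squared_error", cv=10)
rmse_scores = np.sqrt(-scores)  # negate first: the scores are negative MSE values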

As explained on page 108 of this book, this is why we use -scores:

Scikit-Learn cross-validation features expect a utility function (greater is better) rather than a cost function (lower is better), so the scoring function is actually the opposite of the MSE (i.e., a negative value), which is why the preceding code computes -scores before calculating the square root.

To display the results, use this simple function:

def display_scores(scores):
    print("Scores:", scores)
    print("Mean:", scores.mean())
    print("Standard deviation:", scores.std())
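
Putting the pieces above together, a minimal end-to-end sketch might look like the following. The synthetic data from make_regression, the alpha value, and the random_state are assumptions made for illustration; substitute your own X and y.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

# Synthetic regression data (an assumption; replace with your own X and y).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

clf = Lasso(alpha=1.0)

# Each entry is the negated MSE measured on one of the 10 held-out folds.
scores = cross_val_score(clf, X, y, scoring="neg_mean_squared_error", cv=10)

# Flip the sign back to a positive MSE, then take the root to get RMSE.
rmse_scores = np.sqrt(-scores)


def display_scores(scores):
    print("Scores:", scores)
    print("Mean:", scores.mean())
    print("Standard deviation:", scores.std())


display_scores(rmse_scores)
```

The RMSE is in the same units as y, which usually makes it easier to interpret than the raw negated MSE.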

Upvotes: 5

Elisha

Reputation: 23770

You can run 10-fold cross-validation using the model_selection module:

from sklearn import linear_model

# for version 0.18 or newer, use:
from sklearn.model_selection import cross_val_score

# for pre-0.18 versions of scikit-learn, use this import instead:
# from sklearn.cross_validation import cross_val_score

X = ...  # some features (placeholder)
y = ...  # some target values (placeholder)

clf = linear_model.Lasso()
scores = cross_val_score(clf, X, y, cv=10)

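This code returns an array of 10 scores, one per fold. Since no scoring argument is passed, each one is the estimator's default score, which for Lasso is R². You can easily get the mean: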

scores.mean()
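
One thing to keep in mind: with an integer cv and a regressor, the folds are consecutive, unshuffled slices of the data. If your rows are ordered in some way, you may prefer to pass an explicit KFold splitter. The following is only a sketch; the synthetic data and the random_state value are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, cross_val_score

# Synthetic regression data (an assumption; replace with your own X and y).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Shuffle the rows before splitting into 10 folds; fixing random_state
# makes the split reproducible between runs.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# No scoring argument, so each score is Lasso's default R^2 on one fold.
scores = cross_val_score(Lasso(), X, y, cv=cv)

print("Mean R^2:", scores.mean())
print("Standard deviation:", scores.std())
```

Reporting the standard deviation alongside the mean gives a rough idea of how much the score varies from fold to fold.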

Upvotes: 2
