Yoann Brenet

Reputation: 31

Use of Scaler with LassoCV, RidgeCV

I would like to use scikit-learn's LassoCV/RidgeCV while fitting a StandardScaler on the training set of each fold. I do not want to apply the scaler before the cross-validation, to avoid leakage, but I cannot figure out how to do that with LassoCV/RidgeCV.

Is there a way to do this? Or should I create a pipeline with Lasso/Ridge and search for the hyperparameters 'manually' (using GridSearchCV, for instance)?

Many thanks.

Upvotes: 1

Views: 1129

Answers (2)

seralouk

Reputation: 33147

  • If you want to apply the scaling within each iteration of cross-validation, you can use the make_pipeline function. When the resulting pipeline is cross-validated, the scaler is fit on each training fold and only "transform" is applied to the corresponding test fold.

  • make_my_pipe below can be treated as an estimator with a StandardScaler attached to it.

code:

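from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data; substitute your own feature matrix X and target vector y
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

# The scaler is refit on the training part of every fold
make_my_pipe = make_pipeline(StandardScaler(), Ridge())
scores = cross_val_score(make_my_pipe, X, y)

print(scores)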
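
To also tune the regularization strength without leakage, the same kind of pipeline can be passed to GridSearchCV, as the question suggests. The following is only a sketch under assumptions: the synthetic data from make_regression, the alpha grid, and cv=5 are illustrative choices, not values from the question.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in data; replace with your own X and y
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

pipe = make_pipeline(StandardScaler(), Ridge())

# make_pipeline names each step after its lowercased class name,
# so the Ridge penalty is addressed as "ridge__alpha"
param_grid = {"ridge__alpha": np.logspace(-3, 3, 7)}  # assumed grid

# For every candidate alpha and every fold, the whole pipeline is refit,
# so the scaler only ever sees the training part of that fold
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

best_alpha = search.best_params_["ridge__alpha"]
print(best_alpha, search.best_score_)
```

Swapping Ridge() for Lasso() and using "lasso__alpha" as the grid key should work the same way. This is slower than LassoCV/RidgeCV because it does not use their efficient regularization-path computations.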

Upvotes: 0

Yoann Brenet

Reputation: 31

I got the answer through the scikit-learn mailing list, so here it is:

'There is no way to use the "efficient" EstimatorCV objects with pipelines. This is an API bug and there's an open issue and maybe even a PR for that.'

Many thanks to Andreas Mueller for the answer.

Upvotes: 2
