saurabh kumar

Reputation: 21

Does my sklearn pipeline also scale my dependent variables y?

I'm confused about how Pipeline works. Suppose my code is:

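from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([('sc', StandardScaler()),
                 ('pca', PCA(n_components=2)),
                 ('lr', LinearRegression())])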

and I call pipe.fit(X_train, y_train). Does this also scale the y_train values?

Upvotes: 2

Views: 706

Answers (2)

maxi.marufo

Reputation: 406

No, it does not. Pipeline calls fit and then transform on each step in order, except for the last one, which only needs fit. The first two steps in your pipeline, StandardScaler and PCA, ignore y_train when fitting, so they depend only on X_train. The final step, LinearRegression, is then fit on the transformed X_train together with the original, unscaled y_train.
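To make this concrete, here is a minimal sketch that fits the pipeline from the question and then repeats the same steps by hand, passing the untouched y_train to LinearRegression. The synthetic data, the random seed, and the variable names (X_sc, X_pca, same) are assumptions for illustration only, not part of the original question.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumed for illustration): 100 samples, 5 features.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
y_train = 1000.0 + X_train @ np.array([3.0, -2.0, 1.0, 0.5, 4.0]) + rng.normal(size=100)

# The pipeline from the question.
pipe = Pipeline([('sc', StandardScaler()),
                 ('pca', PCA(n_components=2)),
                 ('lr', LinearRegression())])
pipe.fit(X_train, y_train)

# The same steps done by hand: only X is transformed, y_train is passed as-is.
X_sc = StandardScaler().fit_transform(X_train)
X_pca = PCA(n_components=2).fit_transform(X_sc)
lr = LinearRegression().fit(X_pca, y_train)

# If the pipeline had scaled y_train, these predictions would differ.
same = np.allclose(pipe.predict(X_train), lr.predict(X_pca))
print(same)
```

If the two sets of predictions agree, the pipeline behaved like the manual version, in which y_train was never transformed.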

Upvotes: 1

Yoskutik

Reputation: 2099

No, it doesn't. If the pipeline scaled the labels too, you would get scaled predictions as well.
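One way to check this yourself is to use a target with a large offset and look at the scale of the predictions. The sketch below uses made-up data (a target centered around 5000); the data, the seed, and the names pred_mean and y_mean are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumed for illustration): the target lives around 5000.
rng = np.random.default_rng(42)
X_train = rng.normal(size=(200, 4))
y_train = 5000.0 + 300.0 * X_train[:, 0] + rng.normal(scale=10.0, size=200)

pipe = Pipeline([('sc', StandardScaler()),
                 ('pca', PCA(n_components=2)),
                 ('lr', LinearRegression())])
pipe.fit(X_train, y_train)

# Predictions come back in the original units of y_train.
# Had y been standardized inside the pipeline, their mean would be near 0.
pred_mean = pipe.predict(X_train).mean()
y_mean = y_train.mean()
print(pred_mean, y_mean)
```

Because LinearRegression fits an intercept, the mean of the training predictions matches the mean of y_train, which here is in the thousands rather than near zero.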

Upvotes: 1
