Reputation: 21
My confusion is about Pipeline. Suppose my code is

pipe = Pipeline([('sc', StandardScaler()),
                 ('pca', PCA(n_components=2)),
                 ('lr', LinearRegression())])

and I call pipe.fit(X_train, y_train). Does this also scale the y_train values?
Upvotes: 2
Views: 706
Reputation: 406
No, it does not. Pipeline fits its steps in order: every step except the last is fitted and then used to transform the data (fit followed by transform, or fit_transform where the step provides it), while the last step only needs fit. The pipeline does pass y_train to each step's fit method, but your first two steps, StandardScaler and PCA, ignore it, so they depend only on X_train and never modify y_train. The final step, LinearRegression, is then fitted on the transformed X_train together with the original, unscaled y_train.
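One way to check this yourself is to fit the pipeline and then repeat the same steps by hand, passing the untouched y_train to the regressor. The sketch below uses synthetic data (an assumption for illustration; the question's X_train and y_train are not shown) with targets centred around 1000, so any scaling of y would be easy to spot.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, assumed for illustration only: targets sit far from
# zero mean / unit variance so that scaling of y would be obvious.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
y_train = 1000.0 + 50.0 * X_train[:, 0] + rng.normal(size=100)
y_before = y_train.copy()

pipe = Pipeline([('sc', StandardScaler()),
                 ('pca', PCA(n_components=2)),
                 ('lr', LinearRegression())])
pipe.fit(X_train, y_train)

# Manual equivalent: transform X step by step, leave y alone.
X_scaled = StandardScaler().fit_transform(X_train)
X_reduced = PCA(n_components=2).fit_transform(X_scaled)
manual_lr = LinearRegression().fit(X_reduced, y_train)

pipe_lr = pipe.named_steps['lr']
same_coef = np.allclose(pipe_lr.coef_, manual_lr.coef_)
same_intercept = np.isclose(pipe_lr.intercept_, manual_lr.intercept_)
y_untouched = np.array_equal(y_train, y_before)

print("same coefficients:", same_coef)
print("same intercept:", same_intercept)
print("y_train unchanged:", y_untouched)
print("intercept:", pipe_lr.intercept_, "mean of y_train:", y_train.mean())
```

Because the PCA output of the training data has zero mean, the fitted intercept equals the mean of y_train in its original units, which would not be the case if the targets had been standardized.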
Upvotes: 1
Reputation: 2099
No, it doesn't. If the pipeline scaled the labels too, you would get scaled predictions as well, and predict returns values in the original units of y_train.
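A rough sketch of that point, again on synthetic data (assumed for illustration): the pipeline's predictions stay on the scale of y_train. For comparison, it also wraps the same pipeline in scikit-learn's TransformedTargetRegressor, which is one option if you actually want the targets standardized during fitting; it inverts the transform when predicting, so the output is still in the original units. The helper make_pipe is a hypothetical name used only here.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def make_pipe():
    # Hypothetical helper: rebuilds the pipeline from the question.
    return Pipeline([('sc', StandardScaler()),
                     ('pca', PCA(n_components=2)),
                     ('lr', LinearRegression())])


# Synthetic data, assumed for illustration only.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
y_train = 1000.0 + 50.0 * X_train[:, 0] + rng.normal(size=100)

# Plain pipeline: only X is transformed, predictions are in y's units.
preds = make_pipe().fit(X_train, y_train).predict(X_train)
print("mean of y_train:", y_train.mean())
print("mean of predictions:", preds.mean())

# Explicit target scaling: y is standardized for fitting, and
# predictions are mapped back to the original units automatically.
scaled_target_model = TransformedTargetRegressor(
    regressor=make_pipe(), transformer=StandardScaler())
scaled_target_model.fit(X_train, y_train)
preds_scaled_target = scaled_target_model.predict(X_train)
print("mean with target scaling:", preds_scaled_target.mean())
```

For ordinary least squares the two sets of predictions should agree up to floating-point error, since standardizing y is an affine change that the inverse transform undoes; with regularized models the two approaches can differ.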
Upvotes: 1