Reputation: 171
How do I obtain X_train and y_train separately after transforming the data?
Code
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
import pandas as pd
from sklearn.preprocessing import StandardScaler
DATA = pd.read_csv("/storage/emulated/0/Download/iris-write-from-docker.csv")
X = DATA.drop(["class"], axis="columns")
y = DATA["class"].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
pipe = Pipeline(steps=[('clf', StandardScaler())])
dta = pipe.fit_transform(X_train, y_train)
print(dta)
#print(X_train,y_train) from dta
I want to obtain the transformed X_train and y_train from dta.
Upvotes: 0
Views: 985
Reputation: 5324
The output of fit_transform() is the transformed version of X_train only. y_train is not used during the fit_transform() of your pipeline, because StandardScaler ignores y, so it is neither transformed nor returned.
Therefore you can simply do as follows to retrieve the transformed X_train; y_train remains the same and can be used as it is:
pipe=Pipeline(steps=[('clf',StandardScaler())])
X_train_scaled = pipe.fit_transform(X_train)
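For a fuller picture, here is a minimal sketch of the whole flow. It makes a few assumptions that are not in the question: load_iris from scikit-learn stands in for the CSV file, the pipeline step is named "scaler" instead of "clf", and the names X_test_scaled and X_train_scaled_df are only illustrative. It shows that the scaled features come back as a plain array whose rows still line up with the untouched y_train, and one possible way to put the column names back.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: load_iris replaces the CSV from the question so the
# example can run anywhere.
iris = load_iris(as_frame=True)
X = iris.data             # DataFrame with the four feature columns
y = iris.target.values    # NumPy array of class labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# "scaler" is just a step name chosen for this sketch.
pipe = Pipeline(steps=[("scaler", StandardScaler())])

# fit_transform returns only the transformed features; with the default
# scikit-learn output settings this is a NumPy array.
X_train_scaled = pipe.fit_transform(X_train)

# Reuse the mean and standard deviation learned on the training set.
X_test_scaled = pipe.transform(X_test)

# Optional: wrap the array in a DataFrame to restore column names and index.
X_train_scaled_df = pd.DataFrame(
    X_train_scaled, columns=X_train.columns, index=X_train.index
)

print(X_train_scaled_df.head())
print(y_train[:5])  # unchanged, still aligned row by row with X_train_scaled
```

Calling transform (not fit_transform) on X_test keeps the test data from influencing the scaling statistics.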
Upvotes: 2