Reputation: 1
Whenever I try to execute the following code, I get ValueError: y contains previously unseen labels: 'some_label'
X_test['Gender'] = le.transform(X_test['Gender'])
X_test['Age'] = le.transform(X_test['Age'])
X_test['City_Category'] = le.transform(X_test['City_Category'])
X_test['Stay_In_Current_City_Years'] = le.transform(X_test['Stay_In_Current_City_Years'])
Upvotes: 0
Views: 2002
Reputation: 1625
I can't see your whole code, but I think the problem is that your train data differs from your test data: when you call "transform", the test set contains some value that was not present when you fitted the transformer on the train data.
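One way to check this is to compare each test column against the classes its encoder learned. The snippet below is only a sketch: the toy values, the per-column `encoders` dict and the `unseen` dict are my own assumptions, not your code. It also fits one LabelEncoder per column, since a single encoder only remembers the classes from its most recent fit.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data: column names mirror the question, values are made up.
X_train = pd.DataFrame({"Gender": ["M", "F", "M"], "City_Category": ["A", "B", "A"]})
X_test = pd.DataFrame({"Gender": ["F", "M"], "City_Category": ["A", "C"]})

encoders = {}  # one fitted LabelEncoder per column
unseen = {}    # test values each encoder has never seen

for col in ["Gender", "City_Category"]:
    # Fit on the training values only.
    encoders[col] = LabelEncoder().fit(X_train[col])
    # Any value listed here would make encoders[col].transform(X_test[col]) fail.
    unseen[col] = sorted(set(X_test[col]) - set(encoders[col].classes_))

print(unseen)
```

Any column with a non-empty list in `unseen` is the one that triggers the ValueError.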
Let's see it with an example. Below, I fit a ColumnTransformer with a OneHotEncoder on the train data. When I then use it to transform the test data, it throws an error, because it has never seen the value 'z', which is present in the test set but not in the train set:
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import make_column_transformer
df = pd.DataFrame(['a','b','c','a','b','z'], columns=['c1'])
# train holds 'a', 'b', 'c'; test holds 'a', 'b', 'z'
train = df[:3]
test = df[3:]
cl = make_column_transformer((OneHotEncoder(),train.columns))
# the encoder only learns the categories present in train
cl.fit(train)
# 'z' was never seen during fit, so this call raises
cl.transform(test)
This throws the following error:
ValueError: Found unknown categories ['z'] in column 0 during transform
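If you would rather not fail on unknown values, OneHotEncoder accepts handle_unknown="ignore", which encodes an unseen category as an all-zero row. This is just a sketch on the same toy data, and the `dense` variable is my own addition for printing. Whether silently zeroing unknown categories is acceptable depends on your model.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import make_column_transformer

df = pd.DataFrame(['a', 'b', 'c', 'a', 'b', 'z'], columns=['c1'])
train = df[:3]
test = df[3:]

# handle_unknown="ignore" makes unseen categories encode as all zeros
# instead of raising during transform.
cl = make_column_transformer((OneHotEncoder(handle_unknown="ignore"), ["c1"]))
cl.fit(train)
encoded = cl.transform(test)

# The result may be a sparse matrix; convert it for display if so.
dense = encoded.toarray() if hasattr(encoded, "toarray") else encoded
print(dense)
```

The row for 'z' comes out as all zeros, while 'a' and 'b' are encoded as usual.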
Upvotes: 1