Reputation: 1
I am trying to encode a categorical variable, but I get an error about one of the arguments of 'OneHotEncoder'. I think this is because the argument has been replaced by "categories", but I do not know how to encode the categorical variable with the new API.
This is my code:
#importing libraries
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
#importing the dataset
dataset = pd.read_csv('50_Startups.csv')
X = dataset.iloc[:, : -1].values
y = dataset.iloc[:, 4].values
#encoding categorical data, variables that contain categories
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:,3] = labelencoder_X.fit_transform(X[:, 3])
onehotencoder = OneHotEncoder(categorical_features [3] )
X = onehotencoder.fit_transform(X).toarray()
NameError: name 'categorical_features' is not defined
How can I manage to encode the categorical variable named 'State'?
Upvotes: -1
Views: 3209
Reputation: 1
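The NameError is raised because the '=' is missing: categorical_features [3] is parsed as indexing a variable called categorical_features, which was never defined. Adding the '=' will not help on current versions, though, because the categorical_features parameter was deprecated in scikit-learn 0.20 and removed in 0.22.
Use ColumnTransformer from sklearn.compose instead. OneHotEncoder now accepts string columns directly, so the LabelEncoder step is no longer needed: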
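import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
#one-hot encode column 3 ('State') and pass the other columns through unchanged
columnTransformer = ColumnTransformer([('encoder', OneHotEncoder(), [3])], remainder='passthrough')
#the encoded columns come first in the output; cast to float so the result is numeric
X = np.array(columnTransformer.fit_transform(X), dtype=float)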
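For anyone who wants to try this without the CSV, here is a minimal self-contained sketch. The small DataFrame is a made-up stand-in for 50_Startups.csv (the column names and values are assumptions, not the real data), and sparse_threshold=0 is an extra setting I added so that the transformer always returns a dense array.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for 50_Startups.csv; the values are invented for illustration.
dataset = pd.DataFrame({
    "R&D Spend": [100.0, 200.0, 300.0, 400.0],
    "Administration": [10.0, 20.0, 30.0, 40.0],
    "Marketing Spend": [1.0, 2.0, 3.0, 4.0],
    "State": ["New York", "California", "Florida", "New York"],
    "Profit": [50.0, 60.0, 70.0, 80.0],
})

X = dataset.iloc[:, :-1].values  # features, 'State' is column 3
y = dataset.iloc[:, 4].values    # target ('Profit')

# One-hot encode column 3 and pass the remaining columns through unchanged.
# sparse_threshold=0 (an assumption added here) forces a dense result.
ct = ColumnTransformer(
    [("encoder", OneHotEncoder(), [3])],
    remainder="passthrough",
    sparse_threshold=0,
)

# The encoded columns come first, followed by the passthrough columns.
X_encoded = np.array(ct.fit_transform(X), dtype=float)
print(X_encoded.shape)
```

With three distinct states, the three one-hot columns are ordered by sorted category name (California, Florida, New York) and are followed by the three numeric columns.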
Upvotes: 0