Atten12

Reputation: 21

How to use OneHotEncoder on a single categorical column

The shape of df is (190, 2): the first column, X, is categorical and the second column is an integer.

X = df.iloc[:,0].values
y = df.iloc[:,-1].values

# Encoding categorical data

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder = LabelEncoder()
X = labelencoder.fit_transform(X)
X.reshape(-1,1)
onehotencoder = OneHotEncoder(categories = [0])
X = onehotencoder.fit_transform(X).toarray()

I want to encode the categorical column X with OneHotEncoder so that I can use it to predict y, but when I run this code I get an error:

ValueError: bad input shape ()

Can someone help me resolve this issue? Thanks.

Upvotes: 1

Views: 58

Answers (1)

yatu

Reputation: 88226

Recent versions of scikit-learn (0.20 and later) do not require the input features of OneHotEncoder to be numerical, so the LabelEncoder step is unnecessary. You can feed it the categorical features directly:

onehotencoder = OneHotEncoder()
X_oh = onehotencoder.fit_transform(X).toarray()

If you have a 1D array, as X is here (and as y usually is), you'll need to reshape it into a 2D array first:

onehotencoder = OneHotEncoder()
X_oh = onehotencoder.fit_transform(X.reshape(-1,1)).toarray()
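
As a rough end-to-end sketch of the above, here is the same idea on a small made-up array. The category values are hypothetical stand-ins for df.iloc[:,0].values, since the actual data is not shown in the question:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for df.iloc[:, 0].values: a 1D array of string categories.
X = np.array(["red", "green", "blue", "green"])

# OneHotEncoder expects a 2D array of shape (n_samples, n_features),
# so the single column is reshaped to (n_samples, 1).
X_2d = X.reshape(-1, 1)

onehotencoder = OneHotEncoder()
X_oh = onehotencoder.fit_transform(X_2d).toarray()

print(X_oh.shape)                 # one output column per distinct category
print(onehotencoder.categories_)  # categories learned for the single input column
```

Each row of X_oh contains exactly one 1, in the column corresponding to that row's category.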

Do note, however, that the following line from your code:

X.reshape(-1,1)

does nothing on its own. reshape returns a new array rather than modifying X in place, so you have to assign the result back to a variable.
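
A quick way to see this for yourself, using a small made-up integer array in place of the label-encoded X:

```python
import numpy as np

# Hypothetical stand-in for the label-encoded X from the question.
X = np.array([2, 0, 1, 0])

X.reshape(-1, 1)          # the reshaped array is discarded; X itself is unchanged
shape_before = X.shape    # still 1D

X = X.reshape(-1, 1)      # assign the result back to keep the 2D version
shape_after = X.shape     # now 2D with a single column

print(shape_before, shape_after)
```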

Upvotes: 1
