Rafsan Sadman

Reputation: 195

TypeError: __init__() got an unexpected keyword argument 'categorical_features'

Spyder(python 3.7)

I am getting the following error. I have already updated all libraries from the Anaconda Prompt, but I can't find a solution to the problem.

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()

X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])
onehotencoder = OneHotEncoder(categorical_features = [1])
X = onehotencoder.fit_transform(X).toarray()
Traceback (most recent call last):

File "<ipython-input-4-05deb1f02719>", line 2, in <module>
onehotencoder = OneHotEncoder(categorical_features = [1])

TypeError: __init__() got an unexpected keyword argument 'categorical_features'

Upvotes: 16

Views: 75696

Answers (13)

Praveen Inbavadivu

Reputation: 1

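This is the latest solution for the categorical_features error. OneHotEncoder no longer accepts that parameter, so select the column with ColumnTransformer instead.

from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

# One-hot encode column 0 and pass the other columns through unchanged
ct = ColumnTransformer([('encoder', OneHotEncoder(), [0])], remainder='passthrough')
x = ct.fit_transform(x)
x = x[:, 1:]  # drop the first dummy column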

Upvotes: 0

Abhishek Trivedi

Reputation: 91

one_hot_encode = OneHotEncoder(categorical_features=[0]) works in scikit-learn 0.20.3, but the parameter has been removed in scikit-learn 0.24.2 (these are the two versions I checked).

Either downgrade scikit-learn or use ColumnTransformer:

from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

# 2 classes: known/unknown face
ct = ColumnTransformer([("Faces", OneHotEncoder(), [0])], remainder = 'passthrough')
X = ct.fit_transform(X)

# Country column
ct = ColumnTransformer([("Country", OneHotEncoder(), [1])], remainder = 'passthrough')
X = ct.fit_transform(X)

Upvotes: 0

VincentP

Reputation: 89

Another solution, which also converts the resulting X array to float64:

import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
ct = ColumnTransformer([('encoder', OneHotEncoder(), [1])], remainder='passthrough')
X = np.array(ct.fit_transform(X), dtype=np.float64)  # np.float was removed in NumPy 1.24

Upvotes: 0

Spidy ben parker

Reputation: 41

# Encoding categorical data
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
ct = ColumnTransformer([("Geography",OneHotEncoder(),[1])], remainder= 'passthrough')
X = ct.fit_transform(X)
labelencoder_X2 = LabelEncoder()
X[:, 4] = labelencoder_X2.fit_transform(X[:, 4])
X = X[: , 1:]

X = np.array(X, dtype=float)

The extra last line converts X from an array of objects to a float array.

Upvotes: 2

Replace the following code

# onehotencoder = OneHotEncoder(categorical_features = [1])
# X = onehotencoder.fit_transform(X).toarray()
# X = X[:, 1:]

with the following chunk, and your code should work:

labelencoder_X_2 = LabelEncoder()

X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])

columnTransformer = ColumnTransformer([('encoder', OneHotEncoder(), [1])], remainder = 'passthrough')
X = np.array(columnTransformer.fit_transform(X), dtype = np.float64)
X = X[:, 1:]

This assumes you are following the Deep Learning course on Udemy.

Upvotes: 1

carol

Reputation: 1

Here is one extension for OneHotEncoder, for the case where X has a lot of columns. Pass the list of categorical columns instead of a single index:

ct = ColumnTransformer([("encoder", OneHotEncoder(), list(categorical_features))], remainder = 'passthrough')

X = ct.fit_transform(X)
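
To make the multi-column case concrete, here is a minimal sketch on made-up data. The array X_many and the index list categorical_features are hypothetical; sparse_threshold=0 is only there to force a dense result for this tiny example.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: [credit score, country, gender]
X_many = np.array([
    [619, "France", "Female"],
    [608, "Spain", "Male"],
], dtype=object)

# Hypothetical list of the categorical column indices
categorical_features = [1, 2]

ct = ColumnTransformer(
    [("encoder", OneHotEncoder(), list(categorical_features))],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
X_out = ct.fit_transform(X_many)

# Output columns: France, Spain, Female, Male, credit score
print(X_out)
```

The one-hot columns for all listed features come first, followed by the untouched columns.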

Upvotes: 0

Dear_Gabe

Reputation: 41

You need to use another class from sklearn, ColumnTransformer, and then eliminate one column to avoid the dummy variable trap.

# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer # Here is the one

labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])

labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])

#onehotencoder = OneHotEncoder(categorical_features = [1]) Not this one

# use this instead
ct = ColumnTransformer([("Country", OneHotEncoder(), [1])], remainder = 'passthrough')

X = ct.fit_transform(X)
X = X[:, 1:]

Happy Helping!!!

Upvotes: 2

Shomnath Somu

Reputation: 3

# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])

# Removing categorical_features avoids the TypeError, but note that
# OneHotEncoder() then encodes every column of X, not only column 1
onehotencoder = OneHotEncoder()
X = onehotencoder.fit_transform(X).toarray()
X = X[:, 1:]

Upvotes: 0

Pam Cesar

Reputation: 73

    import numpy as np
    from sklearn.preprocessing import OneHotEncoder
    from sklearn.compose import ColumnTransformer
    columnTransformer = ColumnTransformer([('encoder', OneHotEncoder(), [0])], remainder='passthrough')
    X = np.array(columnTransformer.fit_transform(X), dtype=str)

The latest builds of the sklearn library removed the categorical_features parameter from the OneHotEncoder class. It is advised to use the ColumnTransformer class for categorical datasets. Refer to sklearn's official documentation for further clarification.

Upvotes: 7

RaviSole

Reputation: 21

Assuming this problem is from the ML course on Udemy, here is the complete code. I replaced label encoder 1 with a ColumnTransformer, as suggested by Antoine Jaussoin.

Categorical Data

from sklearn.preprocessing import LabelEncoder,OneHotEncoder
from sklearn.compose import ColumnTransformer
ct = ColumnTransformer([("Geography", OneHotEncoder(), [1])], remainder = 'passthrough')
X = ct.fit_transform(X)

Your Gender column will have index 4 now

 labelencoder_x_2=LabelEncoder()
 X[:,4]=labelencoder_x_2.fit_transform(X[:,4])

to avoid dummy variable trap

 X=X[:, 1:]
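
As an alternative to slicing off the first column by hand, OneHotEncoder has a drop parameter in recent scikit-learn versions. A minimal sketch, assuming your installed version has that parameter; the array X_demo is made up, and sparse_threshold=0 only forces a dense result for this small example.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: [credit score, country]
X_demo = np.array([
    [619, "France"],
    [608, "Spain"],
    [502, "Germany"],
], dtype=object)

ct_drop = ColumnTransformer(
    [("Geography", OneHotEncoder(drop="first"), [1])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
X_drop = ct_drop.fit_transform(X_demo).astype(np.float64)

# Categories are France, Germany, Spain; the first one (France) is dropped,
# so the output columns are: Germany, Spain, credit score
print(X_drop)
```

With drop="first" the encoder itself leaves out one dummy column per encoded feature, so no later X[:, 1:] is needed.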

Upvotes: 2

Mahesh C. Regmi

Reputation: 81

from sklearn.preprocessing import OneHotEncoder, LabelEncoder

from sklearn.compose import ColumnTransformer

label_encoder_x_1 = LabelEncoder()
X[: , 2] = label_encoder_x_1.fit_transform(X[:,2])
transformer = ColumnTransformer(
    transformers=[
        ("OneHot",        # Just a name
         OneHotEncoder(), # The transformer instance
         [1]              # The column(s) to apply it to.
         )
    ],
    remainder='passthrough' # do not apply anything to the remaining columns
)
X = transformer.fit_transform(X.tolist())
X = X.astype('float64')

Works like a charm :)

Upvotes: 4

Antoine Jaussoin

Reputation: 5182

So based on your code, you'd have to:

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer

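# Male/Female (encode first, while the column is still at index 2)
labelencoder_X = LabelEncoder()
X[:, 2] = labelencoder_X.fit_transform(X[:, 2])

# Country column (the one-hot columns are placed first, which shifts the remaining columns)
ct = ColumnTransformer([("Country", OneHotEncoder(), [1])], remainder = 'passthrough')
X = ct.fit_transform(X)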

Notice how the first LabelEncoder was removed: you no longer need to apply both the label encoder and the one-hot encoder to the same column.

(I've kinda assumed your example came from the ML Udemy course, and the first column was a list of countries, while the second one a male/female binary choice)
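
For anyone who wants to try this without the course dataset, here is a minimal self-contained sketch. The toy array is made up (credit score, country, gender), and sparse_threshold=0 is only there to force a dense result on such a small example. It label-encodes the gender column first, because ColumnTransformer puts the one-hot columns at the front of its output and so shifts the positions of the passed-through columns.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical toy data: [credit score, country, gender]
X = np.array([
    [619, "France", "Female"],
    [608, "Spain", "Female"],
    [502, "Germany", "Male"],
    [699, "France", "Male"],
], dtype=object)

# Binary column: a plain label encoding (Female -> 0, Male -> 1) is enough
X[:, 2] = LabelEncoder().fit_transform(X[:, 2])

# Country column: one-hot encode column 1, pass the rest through unchanged
ct = ColumnTransformer(
    [("Country", OneHotEncoder(), [1])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
X_enc = ct.fit_transform(X).astype(np.float64)

# Output columns: France, Germany, Spain, credit score, gender
print(X_enc)
```

After the transform the gender column sits at index 4, not 2, which is why indexing by the old position after the ColumnTransformer step is easy to get wrong.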

Upvotes: 30

Amiram

Reputation: 1305

According to the documentation this is the __init__ line:

class sklearn.preprocessing.OneHotEncoder(categories='auto', drop=None, sparse=True, dtype=<class 'numpy.float64'>, handle_unknown='error')

As you can see, __init__ does not accept a categorical_features argument.
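
If you are not sure what your installed version accepts, you can check the constructor signature yourself. A small sketch using only the standard inspect module; the variable names are just for illustration.

```python
import inspect
from sklearn.preprocessing import OneHotEncoder

# Names of the keyword arguments the installed OneHotEncoder accepts
params = inspect.signature(OneHotEncoder).parameters
has_categorical_features = "categorical_features" in params

print(sorted(params))
print("categorical_features supported:", has_categorical_features)
```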

You have a categories parameter:

categories: 'auto' or a list of array-like, default='auto'. Categories (unique values) per feature:

‘auto’ : Determine categories automatically from the training data.

list : categories[i] holds the categories expected in the ith column. The passed categories should not mix strings and numeric values within a single feature, and should be sorted in case of numeric values.

The used categories can be found in the categories_ attribute.

Attributes: categories_: list of arrays. The categories of each feature determined during fitting (in order of the features in X and corresponding with the output of transform). This includes the category specified in drop (if any).
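
A short sketch of how categories and categories_ behave, on a made-up single-column array of country names. The explicit category order in the second encoder is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical single categorical column
countries = np.array([["France"], ["Spain"], ["Germany"], ["France"]], dtype=object)

# categories='auto' (the default): categories are learned from the data
auto_enc = OneHotEncoder()
auto_enc.fit(countries)
print(auto_enc.categories_)

# Explicit categories: one list per column, which fixes the output column order
fixed_enc = OneHotEncoder(categories=[["Spain", "Germany", "France"]])
onehot = fixed_enc.fit_transform(countries).toarray()
print(onehot)
```

With 'auto' the learned categories come out sorted (France, Germany, Spain); with the explicit list the output columns follow the order you passed in.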

Upvotes: 3
