How can I apply OneHotEncoder to one column of an array?

Question

I've been following a tutorial to understand machine learning, trying out what the instructor does as I go.

My array is:

0   44                      72000
2   27                      48000
1   30                      54000
2   38                      61000
1   40                      6.377777777777778101
0   35                      58000
2   38.77777777777777857    52000
0   48                      79000
1   50                      83000
0   37                      67000

The first column used to contain country names, but he used LabelEncoder to transform them to 0s, 1s and 2s.

He also wanted to use OneHotEncoder to expand that column into separate features. His videos are a bit outdated, so he used the categorical_features parameter of OneHotEncoder, but in my sklearn version OneHotEncoder has changed and that parameter no longer exists.

So how can I use OneHotEncoder on that specific feature now?

What he tried was:

onehotencoder = OneHotEncoder(categorical_features = [0])
X = onehotencoder.fit_transform(X).toarray()

insomaniac79 · Accepted Answer

Assuming that your data X has shape (n_rows, n_features) and you would like to apply one-hot encoding to, say, the first column, a quick approach would be

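from sklearn.preprocessing import OneHotEncoder
# The slice 0:1 keeps the first column 2-D, which the encoder expects
onehotencoder = OneHotEncoder()
one_hot = onehotencoder.fit_transform(X[:,0:1]).toarray()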
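Note that the quick approach returns only the encoded column, so the remaining columns have to be joined back by hand. A minimal sketch of that step, assuming X is a numeric NumPy array; the four sample rows are taken from the question and the name X_encoded is only illustrative:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# First four rows of the array from the question: country code, age, salary
X = np.array([[0, 44, 72000],
              [2, 27, 48000],
              [1, 30, 54000],
              [2, 38, 61000]], dtype=float)

# One-hot encode only the first column (kept 2-D by slicing with 0:1)
onehotencoder = OneHotEncoder()
one_hot = onehotencoder.fit_transform(X[:, 0:1]).toarray()

# Put the encoded columns in front of the untouched age and salary columns
X_encoded = np.hstack([one_hot, X[:, 1:]])
print(X_encoded.shape)
```

With three distinct country codes, the single column becomes three indicator columns, so the result has five columns in total.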

A better approach to apply one-hot encoding to only a specific column would be to use ColumnTransformer

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Encode column 0 and pass the remaining columns through unchanged
ct = ColumnTransformer([("country", OneHotEncoder(), [0])], remainder = 'passthrough')
X = ct.fit_transform(X)
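For reference, here is a self-contained sketch of the ColumnTransformer approach on a few rows from the question. The sample array and the name X_out are illustrative, and sparse_threshold=0 is an optional extra (not part of the answer above) that simply guarantees a dense NumPy array as output:

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# First four rows of the array from the question: country code, age, salary
X = np.array([[0, 44, 72000],
              [2, 27, 48000],
              [1, 30, 54000],
              [2, 38, 61000]], dtype=float)

# Encode column 0; remainder='passthrough' keeps age and salary as they are.
# sparse_threshold=0 forces a dense result regardless of how sparse it is.
ct = ColumnTransformer(
    [("country", OneHotEncoder(), [0])],
    remainder="passthrough",
    sparse_threshold=0,
)
X_out = ct.fit_transform(X)
print(X_out.shape)
```

The encoded columns come first in the output, followed by the passed-through columns in their original order.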
