Reputation: 11
I face a problem concerning the LabelEncoder. I applied it to a data set as follows:
data_set1 = data_set.apply(LabelEncoder().fit_transform)
... and it worked. Now, however, I want to retrieve the mapping the LabelEncoder produced, so I tried the following:
le = preprocessing.LabelEncoder()
le.fit(data_set1['column'])
le_name_mapping = dict(zip(le.classes_, le.transform(le.classes_)))
print(le_name_mapping)
I was expecting a dictionary that would look like the following:
{'apple': 0, 'banana': 1, 'kiwi': 2}
and so on... Instead the output was the following:
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
Does anyone have an idea why this happens and how to fix it?
Upvotes: 1
Views: 739
Reputation: 341
I think this simple piece of code:
from sklearn.preprocessing import LabelEncoder

data = ['apple', 'banana', 'kiwi', 'apple']
le = LabelEncoder()
le.fit(data)
le.classes_
outputs what you want: array(['apple', 'banana', 'kiwi'], dtype='<U6'). The first item corresponds to label 0, the second to label 1, and so on.
If you want the corresponding dictionary, you can build it with labels_dict = {index: value for index, value in enumerate(le.classes_)}, so that labels_dict is {0: 'apple', 1: 'banana', 2: 'kiwi'}.
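As for why you saw {0: 0, 1: 1, ...}: data_set1 already holds the encoded integers (you built it with fit_transform), so fitting a second LabelEncoder on it just maps each integer code to itself. Fit on the original, un-encoded column instead. A minimal sketch with stand-in data (the list below is a placeholder, not your actual column):

```python
from sklearn.preprocessing import LabelEncoder

# Stand-in for the original, un-encoded column, e.g. data_set['column']
data = ['apple', 'banana', 'kiwi', 'apple']

le = LabelEncoder()
le.fit_transform(data)  # encode the raw strings

# Map each original label to its integer code,
# casting to plain Python types for clean printing
name_to_code = {str(c): int(i) for c, i in zip(le.classes_, le.transform(le.classes_))}
print(name_to_code)  # {'apple': 0, 'banana': 1, 'kiwi': 2}
```

The key point is that fit (or fit_transform) must see the raw string labels; once the column is integers, classes_ can only contain those integers.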
Upvotes: 4