Xus

Reputation: 198

Transform labels back to original encoding

I have a table like this:

           exterior_color interior_color  ... isTheftRecovered    price
0            Night Black        Unknown  ...                0  16995.0
1    Orca Black Metallic        Unknown  ...                0  17995.0
2        Brilliant Black        Unknown  ...                0   9995.0
3  Azores Green Metallic   Nougat Brown  ...                0  24495.0
4                  Black          Brown  ...                0  16990.0

code:

from sklearn.preprocessing import LabelEncoder
from sklearn import tree
import pandas as pd
import numpy as np


data_frame = pd.read_csv()


le = LabelEncoder()


cols = ['exterior_color', 'interior_color', 'location', 'make', 'model', 'mileage', 'style', 'year', 'engine',
        'accidentCount', 'ownerCount', 'isCleanTitle', 'isFrameDamaged', 'isLemon', 'isSalvage', 'isTheftRecovered', 'price']


data_frame[cols] = data_frame[cols].apply(LabelEncoder().fit_transform)


exclude_price = data_frame[data_frame.columns.difference(['price'])]


clf = tree.DecisionTreeClassifier()
clf = clf.fit(exclude_price, data_frame.price)

my_data = ['Night Black', 'Unknown', 'Patchogue, NY', 'Audi', 'Q7', '5000', 'S-line 3.0T quattro',
           '2015', '2.0L Inline-4 Gas Turbocharged', '0', '5.0', '1', '1', '0', '0', '1']

new_data = le.fit_transform(my_data)

answer = clf.predict([new_data])

print(f"Car's price has been predicted ${answer[0]}")

This code label-encodes the DataFrame and then predicts the price for the given data, but the prediction comes back as an encoded label. How can I use inverse_transform to transform it back to the original encoding and show the actual price?

Upvotes: 1

Views: 982

Answers (2)

Ranjeeth Rikkala

Reputation: 1

answer = le.inverse_transform(answer)

Upvotes: 0

tdy

Reputation: 41327

By encoding as apply(LabelEncoder().fit_transform), you lose access to the encoder objects. Instead you can save them in an encoder dictionary keyed by column name:

from collections import defaultdict
encoder = defaultdict(LabelEncoder)

df[cols] = df[cols].apply(lambda x: encoder[x.name].fit_transform(x))
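To make the idea concrete, here is a minimal sketch on a hypothetical two-column toy frame (the data and the names make_classes and price_classes are made up for illustration, not taken from the question's CSV). It shows that each column ends up with its own fitted LabelEncoder, whose classes_ attribute holds the original values in sorted order:

```python
from collections import defaultdict

import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for the real CSV.
df = pd.DataFrame({
    "make": ["Audi", "BMW", "Audi"],
    "price": [16995.0, 17995.0, 9995.0],
})
cols = ["make", "price"]

# Looking up a missing key creates a fresh LabelEncoder,
# so every column name gets its own encoder object.
encoder = defaultdict(LabelEncoder)
df[cols] = df[cols].apply(lambda x: encoder[x.name].fit_transform(x))

# Each encoder remembers the original values of its column.
make_classes = list(encoder["make"].classes_)
price_classes = list(encoder["price"].classes_)
print(make_classes)
print(price_classes)
```

An encoded label is simply the index of the original value in classes_, which is what makes the reverse mapping possible later.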

And then decode the final price via encoder['price']:

decoded = encoder['price'].inverse_transform(answer)[0]
print(f"Car's price has been predicted as ${decoded:.2f}")

# Car's price has been predicted as $16995.00
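For completeness, here is a rough end-to-end sketch on hypothetical toy data (the frame, the new_car dict, and the features and encoded names are assumptions for illustration only). Besides decoding the prediction, it encodes the new row with the already-fitted per-column encoders via transform rather than fitting a fresh encoder on the row, and it builds the row in the order returned by columns.difference, which sorts the feature names alphabetically:

```python
from collections import defaultdict

import pandas as pd
from sklearn import tree
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for the real CSV.
df = pd.DataFrame({
    "make": ["Audi", "BMW", "Audi", "BMW"],
    "year": ["2015", "2015", "2012", "2012"],
    "price": [16995.0, 17995.0, 9995.0, 11995.0],
})
cols = list(df.columns)

# One fitted encoder per column, kept for later reuse.
encoder = defaultdict(LabelEncoder)
df[cols] = df[cols].apply(lambda x: encoder[x.name].fit_transform(x))

# columns.difference returns the remaining names sorted alphabetically.
features = df.columns.difference(["price"])
clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(df[features], df["price"])

# Encode the new row with the stored encoders (transform, not fit_transform),
# one value per feature, in the same column order used for training.
new_car = {"make": "Audi", "year": "2015"}
encoded = pd.DataFrame(
    {name: encoder[name].transform([new_car[name]]) for name in features}
)

# The classifier predicts an encoded label; map it back to a real price.
answer = clf.predict(encoded)
decoded = encoder["price"].inverse_transform(answer)[0]
print(f"Car's price has been predicted as ${decoded:.2f}")
```

Note that transform raises a ValueError for a value the encoder never saw during fitting, so a new row can only use categories that were present in the training data.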

Upvotes: 4
