Kamal Aujla

Reputation: 327

"ValueError: could not convert string to float" while using OneHotEncoder for machine learning

I'm using LabelEncoder and OneHotEncoder to handle categorical data in my dataset. One column can take two values, 'Petrol' or 'Diesel', and I want to encode it. Running the code below raises an error.

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder,OneHotEncoder

dataset = pd.read_csv('ToyotaCorolla.csv')
X = dataset.iloc[:, 1:10].values
y = dataset.iloc[:, 0].values

labelencoder_X = LabelEncoder()
X[:, 3] = labelencoder_X.fit_transform(X[:, 3])
onehotencoder = OneHotEncoder(categorical_features = [3])
X = onehotencoder.fit_transform(X).toarray()

Column 3 is the one that holds the categorical values, but the code raises "ValueError: could not convert string to float: 'Diesel'". I don't know where I'm going wrong. Please help. Thanks!

Upvotes: 5

Views: 10101

Answers (2)

raj kumar

Reputation: 41

This error occurs when X contains a column whose categories are stored as strings. When I hit this error, I applied LabelEncoder to all the categorical columns in X, as you did for column 3, and then applied OneHotEncoder to column 3.

So what you have to do is LabelEncode all the categorical columns in X and then apply OneHotEncoder to your desired column.

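A minimal sketch of that suggestion, assuming X is an object array like the one produced by dataset.iloc[...].values. The sample rows and the extra colour column are hypothetical stand-ins, not values from ToyotaCorolla.csv. It finds every column that still holds strings, label-encodes each one, and only then converts to float.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical rows; columns 3 and 4 hold strings, the rest are numeric.
X = np.array([
    [23, 46986, 90, "Diesel", "Blue"],
    [24, 72937, 90, "Diesel", "Silver"],
    [30, 38500, 110, "Petrol", "Red"],
], dtype=object)

# Find every column that contains at least one string value.
string_cols = [
    j for j in range(X.shape[1])
    if any(isinstance(v, str) for v in X[:, j])
]

# Replace each string column with integer codes (one encoder per column).
for j in string_cols:
    X[:, j] = LabelEncoder().fit_transform(X[:, j])

# With no strings left, the conversion to float no longer raises.
X_numeric = X.astype(float)
```

Note that the integer codes imply an ordering between categories, which is why a one-hot step is usually applied afterwards to columns that have no natural order.
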
Upvotes: 0

Juan Carlos Ramirez

Reputation: 2129

categorical_features is deprecated. Instead, transform your categorical feature directly:

onehotencoder = OneHotEncoder(categories='auto')
feature = onehotencoder.fit_transform(X[:, 3].reshape(-1, 1))
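
If the encoded column also needs to be joined back to the other features, one option is ColumnTransformer from sklearn.compose. This is a sketch under assumptions, not the asker's data: the sample rows are hypothetical, and column 3 is assumed to be the only string column.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for X; only column 3 is categorical.
X = np.array([
    [23, 46986, 90, "Diesel"],
    [24, 72937, 90, "Diesel"],
    [30, 38500, 110, "Petrol"],
    [26, 41711, 192, "Petrol"],
], dtype=object)

# One-hot encode column 3 and pass the remaining columns through unchanged.
# sparse_threshold=0 asks for a dense array instead of a sparse matrix.
ct = ColumnTransformer(
    transformers=[("fuel", OneHotEncoder(categories="auto"), [3])],
    remainder="passthrough",
    sparse_threshold=0,
)

# The encoded columns come first, followed by the passthrough columns.
X_encoded = ct.fit_transform(X).astype(float)
```

Here the two one-hot columns (Diesel, Petrol) replace the original column 3, so no LabelEncoder step is needed first.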

Upvotes: 5
