Reputation: 81
I am new to coding with sklearn. I need to encode 3 columns of my dataset. I tried encoding only one column, but it raised an error:
ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/sklearn/compose/_column_transformer.py in _hstack(self, Xs)
    614                                        force_all_finite=False)
--> 615                           for X in Xs]
    616             except ValueError:

5 frames

ValueError: could not convert string to float: 'Vikings'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/sklearn/compose/_column_transformer.py in _hstack(self, Xs)
    615                           for X in Xs]
    616             except ValueError:
--> 617                 raise ValueError("For a sparse output, all columns should"
    618                                  " be a numeric or convertible to a numeric.")
    619

ValueError: For a sparse output, all columns should be a numeric or convertible to a numeric.
When I tried to encode all 3 columns, it gave me the result as tuples, but I need it encoded, not in tuples:
(0, 25) 1.0 (0, 62) 1.0 (0, 86) 1.0 (1, 3) 1.0 (1, 44) 1.0 (1, 99) 1.0...
My code is as follows
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
from sklearn.compose import ColumnTransformer
ds = pd.read_csv('nfl_per.csv')
X = ds.iloc[0:2789,4:-1].values
y = ds.iloc[0:2789,-1].values
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [0])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
print(X)
To encode the 3 columns I use:
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
from sklearn.compose import ColumnTransformer
ds = pd.read_csv('nfl_per.csv')
X = ds.iloc[0:2789,4:-1].values
y = ds.iloc[0:2789,-1].values
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [0,1,2])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
print(X)
But again, I don't want the result in tuples; I want it encoded.
The dataset that I am using is the following: https://drive.google.com/file/d/1wn5coKQ5BRWS1Bll5po2H45unWtPLqTX/view?usp=sharing
I would appreciate any guidance and suggestions.
Upvotes: 8
Views: 9400
Reputation: 5059
Try:
OneHotEncoder(sparse=False)
The sparse parameter was deprecated in version 1.2 and removed in 1.4. Use sparse_output instead:
OneHotEncoder(sparse_output=False)
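For context, here is a minimal sketch of how the dense encoder might slot into the ColumnTransformer from the question. The small DataFrame and its column names (team, opponent, venue, points) are made up for illustration and only stand in for nfl_per.csv; the try/except is just one way to cope with the parameter rename across versions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for nfl_per.csv: three text columns and one numeric column.
df = pd.DataFrame({
    "team": ["Vikings", "Bears", "Vikings", "Packers"],
    "opponent": ["Bears", "Packers", "Packers", "Vikings"],
    "venue": ["home", "away", "home", "away"],
    "points": [21, 17, 30, 24],
})
X = df.values  # object array, like the .values call in the question

# Ask the encoder for a dense array instead of a scipy sparse matrix.
try:
    encoder = OneHotEncoder(sparse_output=False)  # scikit-learn 1.2 and later
except TypeError:
    encoder = OneHotEncoder(sparse=False)         # older releases

ct = ColumnTransformer(
    transformers=[("encoder", encoder, [0, 1, 2])],
    remainder="passthrough",  # keep the remaining columns unchanged
)

X_encoded = np.array(ct.fit_transform(X))
print(X_encoded.shape)
print(X_encoded)
```

The "tuples" in the question are the printed form of a scipy sparse matrix, where each (row, column) pair is followed by its value. With a dense encoder the result prints as an ordinary 2-D array. This probably also explains the error in the single-column attempt: the text columns left in the passthrough part cannot be stored in a sparse matrix, whereas a dense result can hold them in an object array.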
Upvotes: 12