Reputation: 173
Below is the code snippet:
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [2, 3, 4])], remainder='passthrough')
X = np.array(ct.fit_transform(x_data))
X.shape
I get the following output for the shape:
()
When I print X, I get the following output:
array(<8820x35 sparse matrix of type '<class 'numpy.float64'>'
with 41527 stored elements in Compressed Sparse Row format>, dtype=object)
Now, when I try to convert this array to a DataFrame:
X = pd.DataFrame(X)
I get the error below:
ValueError: Must pass 2-d input
How do I convert my NumPy array to a DataFrame?
Upvotes: 0
Views: 2169
Reputation: 231385
Looks like
ct.fit_transform(x_data)
produces a sparse matrix.
np.array(...)
just wraps that in an object dtype array, which is why the shape is ():
array(<8820x35 sparse matrix of type '<class 'numpy.float64'>'
with 41527 stored elements in Compressed Sparse Row format>, dtype=object)
Use toarray()
or the .A shorthand
on the sparse matrix itself to convert it properly to a NumPy array:
X = ct.fit_transform(x_data).A
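To see the difference without the original data, here is a minimal sketch. It assumes a small hand-built CSR matrix (sparse_X, a made-up stand-in for the output of ct.fit_transform(x_data)) and is only meant to illustrate the wrapping behaviour, not to reproduce the asker's pipeline:

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Hypothetical stand-in for ct.fit_transform(x_data): a small CSR matrix.
sparse_X = sparse.csr_matrix(np.array([[1.0, 0.0, 0.0],
                                       [0.0, 0.0, 2.0]]))

# np.array() does not densify the matrix; it stores the whole matrix
# as a single Python object inside a 0-dimensional array.
wrapped = np.array(sparse_X)
print(wrapped.shape, wrapped.dtype)

# toarray() returns a regular 2-D float array, which pandas accepts.
dense_X = sparse_X.toarray()
df = pd.DataFrame(dense_X)
print(df.shape)
```

The first print should show the same empty shape as in the question, while the second shows the real row and column counts.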
Upvotes: 1
Reputation: 203
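So first, convert the sparse matrix (a csr_matrix) to a normal array. Call toarray() on the transformer output directly, not on the np.array(...) wrapper from the question, which has no toarray method:
X = ct.fit_transform(x_data).toarray()
df = pd.DataFrame(X)
The above should work.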
Upvotes: 2