Reputation: 437
I want to convert a DataFrame to a sparse matrix using csr_matrix from the scipy library, but first I have to convert it to a SparseDataFrame. In previous versions of pandas I used pd.SparseDataFrame(df).to_coo() for this, but as of pandas 1.0.0 SparseDataFrame has been removed. Does anyone know how to perform this conversion with the latest pandas API? I followed this migration guide and tried various combinations, but I still can't get the desired result.
Following the guide, when I run
csr_matrix(pd.DataFrame.sparse.from_spmatrix(df).to_coo())
I get this error:
AttributeError: 'DataFrame' object has no attribute 'tocsc'
Can anyone help me solve this? I also found other posts, but they didn't help in my case: link, link, link
Upvotes: 1
Views: 3264
Reputation: 2810
IIUC, and going by the third link you shared, you can convert your df data to sparse data using pd.SparseDtype, like this:
df_sparsed = df.astype(pd.SparseDtype("float", np.nan))
You can read more about pd.SparseDtype here to choose the right parameters for your data, and then use it in your command above like this:
csr_matrix(df_sparsed.sparse.to_coo()) # Note you need .sparse accessor to access .to_coo()
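As for the AttributeError in the question: as far as I can tell, pd.DataFrame.sparse.from_spmatrix goes in the opposite direction. It expects a scipy sparse matrix and calls .tocsc() on it, which is why passing a regular DataFrame fails with "no attribute 'tocsc'". A small sketch of the intended use (the 2x2 matrix is made-up sample data, not yours):

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix

# Made-up sample data: a small scipy sparse matrix.
mat = csr_matrix(np.array([[0.0, 1.0], [2.0, 0.0]]))

# from_spmatrix: scipy sparse matrix -> DataFrame with sparse columns.
df_from_sparse = pd.DataFrame.sparse.from_spmatrix(mat)

# The .sparse accessor then takes you back to a scipy COO matrix.
round_trip = df_from_sparse.sparse.to_coo()
```

So from_spmatrix is the tool for scipy-to-pandas, while astype plus .sparse.to_coo() is what you need for pandas-to-scipy.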
A simple one-liner would be:
csr_matrix(df.astype(pd.SparseDtype("float", np.nan)).sparse.to_coo())
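For completeness, here is a self-contained sketch of the whole conversion with the imports included. The sample df is invented for illustration, and I use 0 as the fill value instead of np.nan. That is an assumption about your data (zeros are the "empty" entries), and I believe recent pandas versions raise a ValueError in to_coo() when the fill value is not 0, so adjust to whatever fits your case.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix

# Invented sample data: a mostly-zero dense DataFrame.
df = pd.DataFrame(
    {"a": [0.0, 1.0, 0.0], "b": [2.0, 0.0, 0.0], "c": [0.0, 0.0, 3.0]}
)

# Convert every column to a sparse dtype; zeros are treated as "not stored".
df_sparsed = df.astype(pd.SparseDtype("float", 0))

# .sparse.to_coo() returns a scipy COO matrix, which csr_matrix accepts.
matrix = csr_matrix(df_sparsed.sparse.to_coo())

print(matrix.shape, matrix.nnz)
```

Calling matrix.toarray() should give back the same values as df.to_numpy(), which is a quick way to check that nothing was lost in the conversion.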
Upvotes: 3