Reputation: 13778
I am trying to make a machine learning library work with SciPy sparse matrices.
The code below checks whether y contains more than one class, because classification doesn't make sense when there is only one class.
import numpy as np
y = np.array([0,1,0,1,0,1])
uniques = set(y) # get {0, 1}
if len(uniques) == 1:
    raise RuntimeError("Only one class detected, aborting...")
But set(y) does not work if y is a SciPy sparse matrix.
How can I efficiently get all unique values if y is a SciPy sparse matrix?
PS: I know set(y.todense()) may work, but it costs too much memory.
UPDATE:
>>> import scipy.sparse as sp
>>> y = sp.csr_matrix(np.array([0,1,0,1,0,1]))
>>> set(y.data)
{1}
>>> y.data
array([1, 1, 1])
Upvotes: 2
Views: 2933
Reputation: 231550
Sparse matrices store their values in different ways, but usually there is a .data attribute that contains the nonzero values.
set(y.data) might be all that you need. This should work for the coo, csr, and csc formats. For others you may need to convert the matrix format first (e.g. y.tocoo()).
If that does not work, give us more details on the matrix format and problems.
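One catch, visible in the UPDATE to the question: .data only holds the stored entries, so the implicit zeros never show up in set(y.data). Here is a minimal sketch of one way to account for them, assuming 0 counts as a class label whenever the matrix has fewer stored entries than cells. The helper name sparse_unique is made up for illustration and is not part of SciPy.

```python
import numpy as np
import scipy.sparse as sp


def sparse_unique(y):
    """Hypothetical helper: unique values of a sparse matrix, implicit zeros included."""
    y = sp.csr_matrix(y)            # csr exposes the stored values as y.data
    values = np.unique(y.data)      # unique stored values (explicit zeros included)
    n_cells = y.shape[0] * y.shape[1]
    if y.nnz < n_cells:             # some cells are not stored, so 0 is present
        values = np.union1d(values, [0])
    return values


y = sp.csr_matrix(np.array([0, 1, 0, 1, 0, 1]))
uniques = sparse_unique(y)
if len(uniques) == 1:
    raise RuntimeError("Only one class detected, aborting...")
```

This never densifies the matrix, so the extra memory is proportional to the number of stored entries rather than to the full shape.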
Upvotes: 5