Reputation: 227
I want to divide each row of a csr_matrix by the number of nonzero entries in that row.
For example, consider a csr_matrix A:
A = [[6, 0, 0, 4, 0], [3, 18, 0, 9, 0]]
Result = [[3, 0, 0, 2, 0], [1, 6, 0, 3, 0]]
What's the shortest and most efficient way to do it?
Upvotes: 1
Views: 990
Reputation: 7994
Divakar gives an in-place method. My attempt creates a new matrix instead.
from scipy import sparse
A = sparse.csr_matrix([[6, 0, 0, 4, 0], [3, 18, 0, 9, 0]])
A.multiply(1.0/(A != 0).sum(axis=1))
We multiply each row by the reciprocal of its count of nonzero entries. Note that a row with no nonzero entries has a count of zero, so you may want to guard against division by zero.
As Divakar pointed out, 1.0 instead of 1 is needed in A.multiply(1.0/...) to be compatible with Python 2.
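To make the division-by-zero caveat concrete, here is a minimal sketch of the same multiply-by-reciprocal idea that leaves all-zero rows untouched. The helper name divide_rows_by_nnz and the choice to give empty rows a scale of 0 are my own assumptions for illustration, not part of the original answer.

```python
import numpy as np
from scipy import sparse

def divide_rows_by_nnz(A):
    """Return a new CSR matrix with each row divided by its nonzero count.

    Hypothetical helper: rows without nonzero entries are left as all zeros.
    """
    A = sparse.csr_matrix(A, dtype=float)
    # Count of nonzero entries per row, as a flat 1-D array.
    counts = np.asarray((A != 0).sum(axis=1)).ravel()
    # Reciprocal only where the count is positive; empty rows get scale 0.
    scale = np.zeros(counts.shape, dtype=float)
    nonempty = counts > 0
    scale[nonempty] = 1.0 / counts[nonempty]
    # Broadcasting a column vector scales each row by its own factor.
    return A.multiply(scale[:, None]).tocsr()

A = sparse.csr_matrix([[6, 0, 0, 4, 0], [0, 0, 0, 0, 0], [3, 18, 0, 9, 0]])
print(divide_rows_by_nnz(A).toarray())
```

The result is converted back with tocsr() because multiply may hand back a different sparse format.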
Upvotes: 2
Reputation: 221564
Get the per-row counts with the getnnz method, then replicate them and divide in place into the flattened view of the stored values, available through the data attribute -
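import numpy as np
s = A.getnnz(axis=1)  # number of stored entries in each row
A.data /= np.repeat(s, s)  # each count repeated once per entry of its row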
Inspired by Approach #2 in the solution post to "Row Division in Scipy Sparse Matrix".
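To see why repeating the counts lines up with the stored values, it may help to print the intermediate arrays. This is only an illustrative sketch using the question's matrix; the variable name divisors is mine.

```python
import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix([[6, 0, 0, 4, 0], [3, 18, 0, 9, 0]])

s = A.getnnz(axis=1)        # stored entries per row
divisors = np.repeat(s, s)  # row i's count, repeated s[i] times

print(s)         # [2 3]
print(A.data)    # [ 6  4  3 18  9]
print(divisors)  # [2 2 3 3 3]
```

Because CSR keeps its values row by row in data, the k-th divisor belongs to the same row as the k-th stored value, so an elementwise division does the per-row work.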
Sample run -
In [15]: from scipy.sparse import csr_matrix
In [16]: A = csr_matrix([[6, 0, 0, 4, 0], [3, 18, 0, 9, 0]])
In [18]: s = A.getnnz(axis=1)
    ...: A.data /= np.repeat(s, s)
In [19]: A.toarray()
Out[19]:
array([[3, 0, 0, 2, 0],
       [1, 6, 0, 3, 0]])
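Note: Under Python 3, / is true division, and NumPy will not write the float result back into integer data in place, so A.data /= ... raises a casting error when A holds integers. To get the integer result on both Python 2 and 3, we might want to use //, which floors each quotient -
A.data //= ...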
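If fractional results are wanted rather than floored ones, one option is to hold the values as floats before dividing. The sketch below assumes the matrix can be built with dtype=float, which is my addition and not part of the sample run above.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Build the matrix with float values so true division can happen in place.
A = csr_matrix([[6, 0, 0, 4, 0], [3, 18, 0, 9, 0]], dtype=float)

s = A.getnnz(axis=1)       # stored entries per row
A.data /= np.repeat(s, s)  # no casting problem on float data

print(A.toarray())
```

For an existing integer matrix, A = A.astype(float) gives a float copy to work on, at the cost of the operation no longer being purely in place.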
Upvotes: 6