Brian

Reputation: 745

Select all rows of a scipy.sparse.csr_matrix that have a nonzero entry in column j

I hope my question is clear. Let's say I have a sparse matrix like the following:

import numpy as np
from scipy.sparse import csr_matrix

a = np.eye(5, 5)
a[0,3]=1
a[3,0]=1
a[4,2]=1
a[3,2]=1
a = csr_matrix(a)
print(a.todense())
[[ 1.  0.  0.  1.  0.]
 [ 0.  1.  0.  0.  0.]
 [ 0.  0.  1.  0.  0.]
 [ 1.  0.  1.  1.  0.]
 [ 0.  0.  1.  0.  1.]]

What I want is, for example, all rows whose value in column 2 is 1, returned as a sparse matrix:

 (0, 2) 1.0
 (1, 3) 1.0
 (1, 2) 1.0
 (1, 0) 1.0
 (2, 4) 1.0
 (2, 2) 1.0

I also want all rows whose value in column 2 is 0, as another sparse matrix:

(0, 3)  1.0
(0, 0)  1.0
(1, 1)  1.0

I am not sure whether my code is efficient, but this is what I currently do:

b = np.asarray(a.getcol(2).todense()).reshape(-1)
iPos = np.nonzero(b)[0]
iZero = np.nonzero(np.logical_not(b))[0]
a1 = a[iPos, :]
a0 = a[iZero, :]

Is there a more elegant way to do this? Thanks in advance.

Upvotes: 4

Views: 3583

Answers (1)

lucasg

Reputation: 11012

This is one way to do it:

import numpy as np
from scipy.sparse import csr_matrix


a = np.eye(5, 5)
a[0,3]=1
a[3,0]=1
a[4,2]=1
a[3,2]=1


a = csr_matrix(a)
dense = np.asarray(a.todense())
column = np.asarray(a.getcol(2).todense()).reshape(-1)


print("dense")
# Operations on the full dense matrix
print("1")
print(csr_matrix(np.vstack([line for line in dense if line[2] == 1])))
print("2")
print(csr_matrix(np.vstack([line for line in dense if line[2] == 0])))

print("sparse")
# Operations on the sparse matrix only
result1 = []  # (row, column) pairs from rows where column 2 is 1
result2 = []  # (row, column) pairs from all other rows
for irow in range(a.shape[0]):
    row_pairs = [(irow, indice) for indice in a[irow].indices]
    if column[irow] == 1:
        result1.extend(row_pairs)
    else:
        result2.extend(row_pairs)

print(result1, result2)

The first method is really compact, but it uses the full dense input matrix, which can be a problem if you work with big matrices. The second one works only on the sparse matrix, but its result is a list of (row, column) tuples, not a scipy.sparse matrix.
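If you need both results as sparse matrices without ever building the dense matrix, something along these lines might work. This is only a sketch, not a benchmarked solution: the helper name split_rows_by_column is made up for illustration, and it assumes that "value is 1" can be read as "entry is nonzero", which holds for the example matrix. It relies on the sparse nonzero() method to find the rows with an entry in column j, and on np.setdiff1d to get the remaining rows.

```python
import numpy as np
from scipy.sparse import csr_matrix


def split_rows_by_column(a, j):
    """Hypothetical helper: split the rows of sparse matrix `a` by column j.

    Returns (rows where column j is nonzero, rows where column j is zero),
    both as sparse matrices. Assumes "nonzero" is the selection criterion.
    """
    # nonzero() on the sparse column slice gives (row_indices, col_indices)
    rows_nz = np.unique(a[:, j].nonzero()[0])
    # Every other row index has a zero in column j
    rows_z = np.setdiff1d(np.arange(a.shape[0]), rows_nz)
    return a[rows_nz, :], a[rows_z, :]


# Same example matrix as in the question
a = np.eye(5, 5)
a[0, 3] = 1
a[3, 0] = 1
a[4, 2] = 1
a[3, 2] = 1
a = csr_matrix(a)

a1, a0 = split_rows_by_column(a, 2)
print(a1)  # rows 2, 3 and 4 of the original matrix
print(a0)  # rows 0 and 1 of the original matrix
```

Row indexing with an integer array is the same operation the question already uses (a[iPos, :]), so the results stay sparse; only the way the row indices are computed differs.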

Upvotes: 1
