Reputation: 2129
Let's say I have a (sparse) matrix M
size (N*N, N*N)
. I want to select elements from this matrix where the outer product of grid
(a (n,m)
array, where n*m=N
) is True (it is a boolean 2D array, and na=grid.sum()
). This can be done as follows
result = M[np.outer( grid.flatten(),grid.flatten() )].reshape (( N, N ) )
result
is an (na,na)
sparse array (and na < N
). The previous line is what I want to achieve: get the elements of M that are true from the product of grid
, and squeeze the ones that aren't true out of the array.
As n
and m
(and hence N
) grow, and M
and result
are sparse matrices, I am not able to do this efficiently in terms of memory or speed. Closest I have tried is:
result = sp.lil_matrix ( (1, N*N), dtype=np.float32 )
# Calculate outer product
A = np.einsum("i,j", grid.flatten(), grid.flatten())
cntr = 0
it = np.nditer ( A, flags=['multi_index'] )
while not it.finished:
if it[0]:
result[0,cntr] = M[it.multi_index[0], it.multi_index[1]]
cntr += 1
# reshape result to be a N*N sparse matrix
The last reshape could be done by this approach, but I haven't got there yet, as the while loop is taking forever.
I have also tried selecting nonzero elements of A too, and looping over but this eats up all of the memory:
A=np.einsum("i,j", grid.flatten(), grid.flatten())
nzero = A.nonzero() # This eats lots of memory
cntr = 0
for (i,j) in zip (*nzero):
temp_mat[0,cntr] = M[i,j]
cnt += 1
'n' and 'm' in the example above are around 300.
Upvotes: 0
Views: 321
Reputation: 231385
I don't know if it was a typo, or code error, but your example is missing an iternext
:
R=[]
it = np.nditer ( A, flags=['multi_index'] )
while not it.finished:
if it[0]:
R.append(M[it.multi_index])
it.iternext()
I think appending to a list is simpler and faster than R[ctnr]=...
. It's competitive if R
is a regular array, and sparse indexing is slower (even the fastest lil
format).
ndindex
wraps this use of a nditer as:
R=[]
for index in np.ndindex(A.shape):
if A[index]:
R.append(M[index])
ndenumerate
also works:
R = []
for index,a in np.ndenumerate(A):
if a:
R.append(M[index])
But I wonder if you really want to advance the cntr
each it
step, not just the True
cases. Otherwise reshaping result
to (N,N)
doesn't make much sense. But in that case, isn't your problem just
M[:N, :N].multiply(A)
or if M
was a dense array:
M[:N, :N]*A
In fact if both M
and A
are sparse, then the .data
attribute of that multiply
will be the same as the R
list.
In [76]: N=4
In [77]: M=np.arange(N*N*N*N).reshape(N*N,N*N)
In [80]: a=np.array([0,1,0,1])
In [81]: A=np.einsum('i,j',a,a)
In [82]: A
Out[82]:
array([[0, 0, 0, 0],
[0, 1, 0, 1],
[0, 0, 0, 0],
[0, 1, 0, 1]])
In [83]: M[:N, :N]*A
Out[83]:
array([[ 0, 0, 0, 0],
[ 0, 17, 0, 19],
[ 0, 0, 0, 0],
[ 0, 49, 0, 51]])
In [84]: c=sparse.csr_matrix(M)[:N,:N].multiply(sparse.csr_matrix(A))
In [85]: c.data
Out[85]: array([17, 19, 49, 51], dtype=int32)
In [89]: [M[index] for index, a in np.ndenumerate(A) if a]
Out[89]: [17, 19, 49, 51]
Upvotes: 1