Looping over large sparse array

Question

Let's say I have a (sparse) matrix M size (N*N, N*N). I want to select elements from this matrix where the outer product of grid (a (n,m) array, where n*m=N) is True (it is a boolean 2D array, and na=grid.sum()). This can be done as follows

result = M[np.outer( grid.flatten(),grid.flatten() )].reshape (( N, N ) )

result is an (na,na) sparse array (and na < N). The previous line is what I want to achieve: get the elements of M that are true from the product of grid, and squeeze the ones that aren't true out of the array.

As n and m (and hence N) grow, and M and result are sparse matrices, I am not able to do this efficiently in terms of memory or speed. Closest I have tried is:

result = sp.lil_matrix ( (1, N*N), dtype=np.float32 )
# Calculate outer product
A = np.einsum("i,j", grid.flatten(), grid.flatten())  
cntr = 0
it = np.nditer ( A, flags=['multi_index'] )
while not it.finished:
    if it[0]:
        result[0,cntr] = M[it.multi_index[0], it.multi_index[1]]
        cntr += 1
# reshape result to be a N*N sparse matrix

The last reshape could be done by this approach, but I haven't got there yet, as the while loop is taking forever.

I have also tried selecting nonzero elements of A too, and looping over but this eats up all of the memory:

A=np.einsum("i,j", grid.flatten(), grid.flatten())  
nzero = A.nonzero() # This eats lots of memory
cntr = 0
for (i,j) in zip (*nzero):
    temp_mat[0,cntr] = M[i,j]
    cnt += 1

'n' and 'm' in the example above are around 300.

Looping over large sparse array

Answers (1)

Related Questions