0x90

Reputation: 41002

How to get chunks of submatrices faster?

I have a really big matrix (nxn) for which I would like to build overlapping tiles (submatrices) of dimensions mxm. Contiguous submatrices are offset from each other by step. Here is an example for n=8, m=4, step=2:

import numpy as np
matrix=np.random.randn(8,8)
n=matrix.shape[0]
m=4
step=2

This will store all the corner indices (x,y) from which we will take a 4x4 matrix: (x:x+4, y:y+4)

a={(i,j) for i in range(0,n-m+1,step) for j in range(0,n-m+1,step)}

The submatrices will be extracted like this:

sub_matrices = np.zeros([m,m,len(a)])
for i,ind in enumerate(a):
    x,y=ind
    sub_matrices[:,:,i]=matrix[x:x+m, y:y+m]

Is there a faster way to initialize these submatrices?

Upvotes: 2

Views: 143

Answers (2)

Divakar

Reputation: 221614

We can leverage scikit-image's view_as_windows, which is built on np.lib.stride_tricks.as_strided, to get sliding windows. More info on use of as_strided based view_as_windows.

from skimage.util.shape import view_as_windows   

# Get indices as array 
ar = np.array(list(a))

# Get all sliding windows
w = view_as_windows(matrix,(m,m))

# Get selective ones by indexing with ar
selected_windows = np.moveaxis(w[ar[:,0],ar[:,1]],0,2)

Alternatively, we can extract the row and col indices with a list comprehension and then index with those, like so -

R = [i[0] for i in a]
C = [i[1] for i in a]
selected_windows = np.moveaxis(w[R,C],0,2)

Optimizing from the start, we can skip creating the index set a and simply use the step argument of view_as_windows, like so -

view_as_windows(matrix,(m,m),step=step)

This gives us a 4D array; indexing into its first two axes selects each of the mxm shaped windows. These windows are simply views into the input, so there is no extra memory overhead and the runtime is virtually free!
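For readers without scikit-image, here is a minimal NumPy-only sketch of the same idea. It assumes NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view is available; that function is not part of the answer above and is swapped in here only for illustration. The final step rebuilds the (m, m, k) layout used in the question, which does require a copy.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Example sizes taken from the question.
n, m, step = 8, 4, 2
matrix = np.random.randn(n, n)

# All m x m windows as a read-only view, then keep every `step`-th window
# along both window-index axes. No data is copied at this point.
windows = sliding_window_view(matrix, (m, m))[::step, ::step]

# Optional: collapse the two window-index axes and move them last to get
# the (m, m, k) layout from the question. The reshape makes a copy.
sub_matrices = np.moveaxis(windows.reshape(-1, m, m), 0, -1)

print(windows.shape)       # (3, 3, 4, 4)
print(sub_matrices.shape)  # (4, 4, 9)
```

Here windows[i, j] is the tile whose top-left corner is (i*step, j*step), so the tiles come out in row-major corner order, whereas iterating over the set a in the question gives no guaranteed order.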

Upvotes: 3

milkice

Reputation: 515

import numpy as np

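n, m, step = 8, 4, 2  # example values from the question
a = np.random.randn(n, n)

# Take every step-th row and column from the range 0 .. m*step
b = a[0:m*step:step, 0:m*step:step]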

If you have a one-dimensional array, you can get its sub-array with the following code:

c = a[start:end:step]

If the array has two or more dimensions, separate the slice for each dimension with a comma.

d = a[start1:end1:step1, start2:end2:step2]
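As a small illustration of what the step in a slice does, the sketch below uses an 8x8 array of consecutive integers (an example input chosen here, not taken from the answer). A slice step picks every step-th row and column, which is not the same as a contiguous mxm tile whose corner moves by step, as asked for in the question.

```python
import numpy as np

# 8x8 array holding 0..63 so the selected elements are easy to read.
a = np.arange(64).reshape(8, 8)
m, step = 4, 2

# Slice with a step: rows 0, 2, 4, 6 and columns 0, 2, 4, 6.
strided = a[0:m*step:step, 0:m*step:step]

# Contiguous m x m tile with its top-left corner at (0, 0).
tile = a[0:m, 0:m]

print(strided[0])  # [0 2 4 6]
print(tile[0])     # [0 1 2 3]
```

Both results have shape (4, 4), but only the second one is a tile in the sense of the question.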

Upvotes: 1
