k200338 Mahad Hameed
k200338 Mahad Hameed

Reputation: 47

Performing im2Col with numpy.lib.stride_tricks.as_strided

Consider a 4d array:

x = [[[[ 1  2  3  4]
       [ 5  6  7  8]
       [ 9 10 11 12]
       [13 14 15 16]]
    
      [[17 18 19 20]
       [21 22 23 24]
       [25 26 27 28]
       [29 30 31 32]]]]

I want to slide a (2x2) filter through both of these channels, and flatten each window and concatenate. The expected output is:

[[[ 1.  2.  3.  5.  6.  7.  9. 10. 11.]
  [ 2.  3.  4.  6.  7.  8. 10. 11. 12.]
  [ 5.  6.  7.  9. 10. 11. 13. 14. 15.]
  [ 6.  7.  8. 10. 11. 12. 14. 15. 16.]

  [17. 18. 19. 21. 22. 23. 25. 26. 27.]
  [18. 19. 20. 22. 23. 24. 26. 27. 28.]
  [21. 22. 23. 25. 26. 27. 29. 30. 31.]
  [22. 23. 24. 26. 27. 28. 30. 31. 32.]]]

Although i have shown these channels separately, they are part of the same array. My question is, is this achievable solely through the strides utility as_strided. Heres what i have tried.

import numpy as np

def getWindows(input: np.ndarray, kernel_size: int):
    batch_str, channel_str, kern_h_str, kern_w_str = input.strides
    out_size =  input.shape[2] - kernel_size + 1
    return (
        np.lib.stride_tricks.as_strided(
            input,
            (input.shape[0], input.shape[1], out_size, out_size, kernel_size, kernel_size),
            (batch_str, channel_str, kern_h_str, kern_w_str, kern_h_str, kern_w_str)
        )
        .reshape(
            input.shape[0], 
            input.shape[1], 
            out_size * out_size, 
            kernel_size * kernel_size
        )
        .swapaxes(2, 3)
        .reshape(
            input.shape[0], 
            input.shape[1] * kernel_size * kernel_size,
            out_size * out_size
        )
    )
x = np.arange(1, 33).reshape(1, 2, 4, 4)
print(x)
print(getWindows(x, kernel_size=2))

Although this solution works, its extremely slow and ugly.

Upvotes: 1

Views: 547

Answers (1)

hpaulj
hpaulj

Reputation: 231530

A more direct way - still makes a copy. sliding_window still uses as_strided, but in a standardized, safer manner.

In [250]: y = np.lib.stride_tricks.sliding_window_view(x,(1,1,3,3))

In [251]: y.shape
Out[251]: (1, 2, 2, 2, 1, 1, 3, 3)

In [252]: y.reshape(2,4,9)
Out[252]: 
array([[[ 1,  2,  3,  5,  6,  7,  9, 10, 11],
        [ 2,  3,  4,  6,  7,  8, 10, 11, 12],
        [ 5,  6,  7,  9, 10, 11, 13, 14, 15],
        [ 6,  7,  8, 10, 11, 12, 14, 15, 16]],

       [[17, 18, 19, 21, 22, 23, 25, 26, 27],
        [18, 19, 20, 22, 23, 24, 26, 27, 28],
        [21, 22, 23, 25, 26, 27, 29, 30, 31],
        [22, 23, 24, 26, 27, 28, 30, 31, 32]]])

timing:

In [254]: timeit y = np.lib.stride_tricks.sliding_window_view(x,(1,1,3,3)).reshape(2,4,9)
54.5 µs ± 135 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [257]: timeit getWindows(x, kernel_size=2)
21.8 µs ± 30.7 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

If you don't need overlaps, you can skip the strided and directly use reshape and transposes:

In [264]: x.reshape(1,2,2,2,2,2).transpose(0,1,2,4,3,5).reshape(1,2,4,4)
Out[264]: 
array([[[[ 1,  2,  5,  6],
         [ 3,  4,  7,  8],
         [ 9, 10, 13, 14],
         [11, 12, 15, 16]],

        [[17, 18, 21, 22],
         [19, 20, 23, 24],
         [25, 26, 29, 30],
         [27, 28, 31, 32]]]])

In [265]: timeit x.reshape(1,2,2,2,2,2).transpose(0,1,2,4,3,5).reshape(1,2,4,4)
2.67 µs ± 30 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

striding, even though it returns a view, is not super fast. I haven't tracked down why, but here's the sliding without the copying:

In [266]: timeit y = np.lib.stride_tricks.sliding_window_view(x,(1,1,3,3))
49.1 µs ± 41.4 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Michael's version (here `x1=x.astype('int64'):

In [271]: timeit np.lib.stride_tricks.as_strided(x1, (2, 2, 2, 3, 3), strides=(128, 32, 8, 32, 8)).reshape(1, -1, 9)
16.5 µs ± 30.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Upvotes: 1

Related Questions