Reputation: 47
Consider a 4d array:
x = [[[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]
[13 14 15 16]]
[[17 18 19 20]
[21 22 23 24]
[25 26 27 28]
[29 30 31 32]]]]
I want to slide a (2x2) filter through both of these channels, and flatten each window and concatenate. The expected output is:
[[[ 1. 2. 3. 5. 6. 7. 9. 10. 11.]
[ 2. 3. 4. 6. 7. 8. 10. 11. 12.]
[ 5. 6. 7. 9. 10. 11. 13. 14. 15.]
[ 6. 7. 8. 10. 11. 12. 14. 15. 16.]
[17. 18. 19. 21. 22. 23. 25. 26. 27.]
[18. 19. 20. 22. 23. 24. 26. 27. 28.]
[21. 22. 23. 25. 26. 27. 29. 30. 31.]
[22. 23. 24. 26. 27. 28. 30. 31. 32.]]]
Although i have shown these channels separately, they are part of the same array. My question is, is this achievable solely through the strides utility as_strided. Heres what i have tried.
import numpy as np
def getWindows(input: np.ndarray, kernel_size: int):
batch_str, channel_str, kern_h_str, kern_w_str = input.strides
out_size = input.shape[2] - kernel_size + 1
return (
np.lib.stride_tricks.as_strided(
input,
(input.shape[0], input.shape[1], out_size, out_size, kernel_size, kernel_size),
(batch_str, channel_str, kern_h_str, kern_w_str, kern_h_str, kern_w_str)
)
.reshape(
input.shape[0],
input.shape[1],
out_size * out_size,
kernel_size * kernel_size
)
.swapaxes(2, 3)
.reshape(
input.shape[0],
input.shape[1] * kernel_size * kernel_size,
out_size * out_size
)
)
x = np.arange(1, 33).reshape(1, 2, 4, 4)
print(x)
print(getWindows(x, kernel_size=2))
Although this solution works, its extremely slow and ugly.
Upvotes: 1
Views: 547
Reputation: 231530
A more direct way - still makes a copy. sliding_window
still uses as_strided
, but in a standardized, safer manner.
In [250]: y = np.lib.stride_tricks.sliding_window_view(x,(1,1,3,3))
In [251]: y.shape
Out[251]: (1, 2, 2, 2, 1, 1, 3, 3)
In [252]: y.reshape(2,4,9)
Out[252]:
array([[[ 1, 2, 3, 5, 6, 7, 9, 10, 11],
[ 2, 3, 4, 6, 7, 8, 10, 11, 12],
[ 5, 6, 7, 9, 10, 11, 13, 14, 15],
[ 6, 7, 8, 10, 11, 12, 14, 15, 16]],
[[17, 18, 19, 21, 22, 23, 25, 26, 27],
[18, 19, 20, 22, 23, 24, 26, 27, 28],
[21, 22, 23, 25, 26, 27, 29, 30, 31],
[22, 23, 24, 26, 27, 28, 30, 31, 32]]])
timing:
In [254]: timeit y = np.lib.stride_tricks.sliding_window_view(x,(1,1,3,3)).reshape(2,4,9)
54.5 µs ± 135 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [257]: timeit getWindows(x, kernel_size=2)
21.8 µs ± 30.7 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
If you don't need overlaps, you can skip the strided
and directly use reshape and transposes:
In [264]: x.reshape(1,2,2,2,2,2).transpose(0,1,2,4,3,5).reshape(1,2,4,4)
Out[264]:
array([[[[ 1, 2, 5, 6],
[ 3, 4, 7, 8],
[ 9, 10, 13, 14],
[11, 12, 15, 16]],
[[17, 18, 21, 22],
[19, 20, 23, 24],
[25, 26, 29, 30],
[27, 28, 31, 32]]]])
In [265]: timeit x.reshape(1,2,2,2,2,2).transpose(0,1,2,4,3,5).reshape(1,2,4,4)
2.67 µs ± 30 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
striding
, even though it returns a view
, is not super fast. I haven't tracked down why, but here's the sliding
without the copying:
In [266]: timeit y = np.lib.stride_tricks.sliding_window_view(x,(1,1,3,3))
49.1 µs ± 41.4 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
Michael's version (here `x1=x.astype('int64'):
In [271]: timeit np.lib.stride_tricks.as_strided(x1, (2, 2, 2, 3, 3), strides=(128, 32, 8, 32, 8)).reshape(1, -1, 9)
16.5 µs ± 30.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
Upvotes: 1