Splitting multidimensional array in Numpy

Question

I'm trying to split a multidimensional array (array)

import numpy as np

shape = (3, 4, 4, 2)
array = np.random.randint(0,10,shape)

into an array (new_array) with shape (3,2,2,2,2,2), where axis 1 of array has been split into two axes (axes 1 and 2 of new_array) and axis 2 of array has been split into two axes (axes 3 and 4 of new_array).

So far I have a working method:

div_x = 2
div_y = 2
new_dim_x = shape[1]//div_x
new_dim_y = shape[2]//div_y

new_array_split = np.array([np.split(each_sub, axis=2, indices_or_sections=div_y) for each_sub in np.split(array[:, :(new_dim_x*div_x), :(new_dim_y*div_y)], axis=1, indices_or_sections=div_x)])

I'm also looking into using reshape:

new_array_reshape = array[:, :(div_x*new_dim_x), :(div_y*new_dim_y), ...].reshape(shape[0], div_x, div_y, new_dim_x, new_dim_y, shape[-1]).transpose(1,2,0,3,4,5)

The reshape method is faster than the split method:

%timeit array[:, :(div_x*new_dim_x), :(div_y*new_dim_y), ...].reshape(shape[0], div_x, div_y, new_dim_x, new_dim_y, shape[-1]).transpose(1,2,0,3,4,5)
2.16 µs ± 44.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit np.array([np.split(each_sub, axis=2, indices_or_sections=div_y) for each_sub in np.split(array[:, :(new_dim_x*div_x), :(new_dim_y*div_y)], axis=1, indices_or_sections=div_x)])
58.3 µs ± 2.13 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

However, I cannot get the same results, because of the last dimension:

print('Reshape method')
print(new_array_reshape[1,0,0,...])
print('\nSplit method')
print(new_array_split[1,0,0,...])
 
Reshape method
[[[2 2]
  [4 3]]
 [[3 5]
  [5 9]]]

Split method
[[[2 2]
  [4 3]]
 [[5 3]
  [9 8]]]

The split method does exactly what I want. I checked it number by number and it produces the split I need, but not at the speed I would like.

QUESTION

Is there a way to achieve the same results as the split method, using reshape or any other approach?

CONTEXT

The array is actually flow data from image processing, where the first dimension of array is time, the second dimension is coordinate x (4), the third dimension is coordinate y (4) and the fourth dimension (2) holds the magnitude and phase of the flow.

I would like to split the images (coordinates x and y) into 2x2 subimages, so I can analyse the flow more locally, perform averages, clustering, etc.

This splitting is going to be performed many times, which is why I'm looking for an efficient solution. I believe reshape is probably the way to go, but I'm open to any other option.

Divakar · Accepted Answer

Reshape so that each split axis becomes a (block index, index within block) pair, then permute the axes:

array.reshape(3,2,2,2,2,2).transpose(1,3,0,2,4,5)
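
The one-liner above hard-codes the sizes from the question. As a minimal sketch, the same idea can be written with the question's div_x and div_y and checked against the split method. The helper name split_into_blocks is hypothetical (it is not part of NumPy or of the original answer), and the sketch assumes a 4-D input laid out as (time, x, y, channel):

```python
import numpy as np

def split_into_blocks(array, div_x, div_y):
    """Hypothetical helper: split axes 1 and 2 of a (t, x, y, c) array into blocks.

    Returns an array of shape (div_x, div_y, t, x // div_x, y // div_y, c).
    """
    t, x, y, c = array.shape
    new_dim_x, new_dim_y = x // div_x, y // div_y
    # Drop trailing rows/columns that do not fill a whole block.
    trimmed = array[:, :div_x * new_dim_x, :div_y * new_dim_y]
    # Each split axis becomes (block index, index inside the block), in that order.
    blocks = trimmed.reshape(t, div_x, new_dim_x, div_y, new_dim_y, c)
    # Move the two block indices to the front, matching the split method's layout.
    return blocks.transpose(1, 3, 0, 2, 4, 5)

shape = (3, 4, 4, 2)
array = np.random.randint(0, 10, shape)
div_x = div_y = 2

new_array_reshape = split_into_blocks(array, div_x, div_y)

# Reference result built with the split method from the question.
new_array_split = np.array([
    np.split(each_sub, div_y, axis=2)
    for each_sub in np.split(array, div_x, axis=1)
])

print(new_array_reshape.shape)
print(np.array_equal(new_array_reshape, new_array_split))
```

The key difference from the reshape attempt in the question is the order of the new axes: div_x must sit directly before new_dim_x, and div_y directly before new_dim_y, because reshape can only split an axis into adjacent axes. With the blocks in this layout, a per-block average over the pixel axes could be taken with new_array_reshape.mean(axis=(3, 4)).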
