Luca

Reputation: 10996

collapsing all dimensions of numpy array except the first two

I have a NumPy array with a variable number of dimensions; for example, it could have any of the following shapes:

(64, 64)
(64, 64, 2, 5)
(64, 64, 40)
(64, 64, 10, 20, 4)

If the number of dimensions is greater than 3, I want to collapse/stack all the trailing dimensions into the third dimension while preserving order. For the examples above, the shapes after the operation should be:

(64, 64)
(64, 64, 10)
(64, 64, 40)
(64, 64, 800)

Also, the order needs to be preserved. For example, the array of the shape (64, 64, 2, 5) should be stacked as

(64, 64, 2)
(64, 64, 2)
(64, 64, 2)
(64, 64, 2)
(64, 64, 2)

i.e. the 3D slices one after the other. Also, after the operation I would like to reshape it back to the original shape without any permutation i.e. preserve the original order.

One way I could do this is to multiply together all the dimension sizes from the third to the last, i.e.

shape = array.shape
if len(shape) > 3:
    final_dim = 1
    for i in range(2, len(shape)):
        final_dim *= shape[i]

and then reshape the array. Something like:

array.reshape(64, 64, final_dim)

However, I am not sure whether this preserves the order the way I want. Is there a better, more Pythonic way to achieve this?

Upvotes: 4

Views: 5532

Answers (4)

Martin Krämer

Reputation: 577

EDIT: As pointed out in the other answers, it is even easier to pass -1 as the third dimension to reshape. NumPy then infers the correct size automatically.

I am not sure what the problem here is. You can just use np.reshape and it preserves order. See the following code:

import numpy as np

A = np.random.rand(20, 20, 2, 2, 18, 5)
print(A.shape)

# Size of the merged third axis: the product of all trailing dimensions.
new_dim = np.prod(A.shape[2:])
print(new_dim)
B = np.reshape(A, (A.shape[0], A.shape[1], new_dim))
print(B.shape)

# Reshaping back to the original shape recovers A exactly.
C = B.reshape(A.shape)
print(np.array_equal(A, C))

The output is:

(20, 20, 2, 2, 18, 5)
360
(20, 20, 360)
True

This accomplishes exactly what you asked for.
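For reference, the -1 variant mentioned in the EDIT could be wrapped up roughly like this. It is only a sketch: the name collapse_trailing is made up for illustration, and it assumes plain C-order reshaping is the order you want.

```python
import numpy as np

def collapse_trailing(a):
    """Merge every axis after the first two into a single third axis (C order).

    Hypothetical helper name; arrays with 3 or fewer dimensions are returned as-is.
    """
    if a.ndim <= 3:
        return a
    # -1 asks NumPy to infer the size of the merged axis.
    return a.reshape(a.shape[0], a.shape[1], -1)

a = np.arange(64 * 64 * 2 * 5).reshape(64, 64, 2, 5)
b = collapse_trailing(a)

# Reshaping back with the saved original shape undoes the collapse.
restored = b.reshape(a.shape)
```

Keeping a.shape around before collapsing is all that is needed to get back to the original layout.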

Upvotes: 5

hpaulj

Reputation: 231540

I'll try to illustrate the concern that @Divakar brings up.

In [522]: arr = np.arange(2*2*3*4).reshape(2,2,3,4)
In [523]: arr
Out[523]: 
array([[[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11]],

        [[12, 13, 14, 15],
         [16, 17, 18, 19],
         [20, 21, 22, 23]]],


       [[[24, 25, 26, 27],
         [28, 29, 30, 31],
         [32, 33, 34, 35]],

        [[36, 37, 38, 39],
         [40, 41, 42, 43],
         [44, 45, 46, 47]]]])

4 is the innermost dimension, so it displays the array as 3x4 blocks. And if you pay attention to the spaces and [] you'll see there are 2x2 blocks.

Notice what happens when we use the reshape:

In [524]: arr1 = arr.reshape(2,2,-1)
In [525]: arr1
Out[525]: 
array([[[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
        [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]],

       [[24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
        [36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]]])

Now it is two 2x12 blocks. You can do anything to those 12-element rows and then reshape them back to 3x4 blocks:

In [526]: arr1.reshape(2,2,3,4)
Out[526]: 
array([[[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
      ...

But I could also split this array on the last dimension. np.split can do it, but a list comprehension is easier to understand:

In [527]: alist = [arr[...,i] for i in range(4)]
In [528]: alist
Out[528]: 
[array([[[ 0,  4,  8],
         [12, 16, 20]],

        [[24, 28, 32],
         [36, 40, 44]]]), 
 array([[[ 1,  5,  9],
         [13, 17, 21]],

        [[25, 29, 33],
         [37, 41, 45]]]), 
 array([[[ 2,  6, 10],
         [14, 18, 22]],

        [[26, 30, 34],
         [38, 42, 46]]]), 
 array([[[ 3,  7, 11],
         [15, 19, 23]],

        [[27, 31, 35],
         [39, 43, 47]]])]

This contains 4 (2,2,3) arrays. Note that the 3 element rows display as columns in the 4d display.

I can reform into a 4d array with np.stack (which is like np.array, but gives more control of how the arrays are joined):

In [529]: np.stack(alist, axis=-1)
Out[529]: 
array([[[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11]],
         ...
        [[36, 37, 38, 39],
         [40, 41, 42, 43],
         [44, 45, 46, 47]]]])

==========

The split equivalent is [x[...,0] for x in np.split(arr, 4, axis=-1)]. Without the indexing split produces (2, 2, 3, 1) arrays.

collapse_dims produces (for my example):

In [532]: np.rollaxis(arr,-1,2).reshape(arr.shape[0],arr.shape[1],-1)
Out[532]: 
array([[[ 0,  4,  8,  1,  5,  9,  2,  6, 10,  3,  7, 11],
        [12, 16, 20, 13, 17, 21, 14, 18, 22, 15, 19, 23]],

       [[24, 28, 32, 25, 29, 33, 26, 30, 34, 27, 31, 35],
        [36, 40, 44, 37, 41, 45, 38, 42, 46, 39, 43, 47]]])

A (2,2,12) array, but with the elements in rows in a different order. It does a transpose on the inner 2 dimensions before flattening.

In [535]: arr[0,0,:,:].ravel()
Out[535]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
In [536]: arr[0,0,:,:].T.ravel()
Out[536]: array([ 0,  4,  8,  1,  5,  9,  2,  6, 10,  3,  7, 11])

Restoring that to the original order (with arr2 holding the (2,2,12) result from In [532]) requires another roll or transpose:

In [542]: arr2.reshape(2,2,4,3).transpose(0,1,3,2)
Out[542]: 
array([[[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11]],

      ....

        [[36, 37, 38, 39],
         [40, 41, 42, 43],
         [44, 45, 46, 47]]]])
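The difference between the two orderings can also be checked programmatically. This is just a sketch on the same 2x2x3x4 example; it uses np.moveaxis in place of np.rollaxis (for this call they move the last axis to position 2 in the same way), and the variable names are mine.

```python
import numpy as np

arr = np.arange(2 * 2 * 3 * 4).reshape(2, 2, 3, 4)

# Plain reshape: flattens the trailing (3, 4) block row by row.
plain = arr.reshape(2, 2, -1)

# Move the last axis in front of the other trailing axis, then flatten.
rolled = np.moveaxis(arr, -1, 2).reshape(2, 2, -1)

# Joining the arr[..., i] slices one after the other along the last axis
# gives the same layout as `rolled`, not `plain`.
stacked = np.concatenate([arr[..., i] for i in range(arr.shape[-1])], axis=-1)

print(plain[0, 0])   # [ 0  1  2  3  4  5  6  7  8  9 10 11]
print(rolled[0, 0])  # [ 0  4  8  1  5  9  2  6 10  3  7 11]
print(np.array_equal(stacked, rolled))  # True
```

So which version is "right" depends on whether "preserving order" means the memory order of the original array or the slice-after-slice stacking described in the question.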

Upvotes: 0

Divakar

Reputation: 221624

Going by the requirement of stacking for the given (64, 64, 2, 5) sample, I think you need to permute the axes. For the permuting, we can use np.rollaxis, like so -

def collapse_dims(a):
    if a.ndim>3:
        return np.rollaxis(a,-1,2).reshape(a.shape[0],a.shape[1],-1)
    else:
        return a

Sample run on the given four sample shapes -

1) Sample shapes :

In [234]: shp1 = (64, 64)
     ...: shp2 = (64, 64, 2, 5)
     ...: shp3 = (64, 64, 40)
     ...: shp4 = (64, 64, 10, 20, 4)
     ...: 

Case #1 :

In [235]: a = np.random.randint(11,99,(shp1))

In [236]: np.allclose(a, collapse_dims(a))
Out[236]: True

Case #2 :

In [237]: a = np.random.randint(11,99,(shp2))

In [238]: np.allclose(a[:,:,:,0], collapse_dims(a)[:,:,0:2])
Out[238]: True

In [239]: np.allclose(a[:,:,:,1], collapse_dims(a)[:,:,2:4])
Out[239]: True

In [240]: np.allclose(a[:,:,:,2], collapse_dims(a)[:,:,4:6]) # .. so on
Out[240]: True

Case #3 :

In [241]: a = np.random.randint(11,99,(shp3))

In [242]: np.allclose(a, collapse_dims(a))
Out[242]: True

Case #4 :

In [243]: a = np.random.randint(11,99,(shp4))

In [244]: np.allclose(a[:,:,:,:,0].ravel(), collapse_dims(a)[:,:,:200].ravel())
Out[244]: True

In [245]: np.allclose(a[:,:,:,:,1].ravel(), collapse_dims(a)[:,:,200:400].ravel())
Out[245]: True
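Since the question also asks about getting back to the original shape, here is one possible sketch of the inverse. restore_dims is a hypothetical helper, not part of NumPy, and it assumes the original shape was saved as a tuple before collapsing.

```python
import numpy as np

def collapse_dims(a):
    # Same function as above, repeated so the snippet runs on its own.
    if a.ndim > 3:
        return np.rollaxis(a, -1, 2).reshape(a.shape[0], a.shape[1], -1)
    return a

def restore_dims(b, original_shape):
    """Undo collapse_dims, given the shape the array had before collapsing."""
    if len(original_shape) <= 3:
        return b
    # Shape right after the roll: the last axis sits at position 2.
    rolled_shape = original_shape[:2] + (original_shape[-1],) + original_shape[2:-1]
    # Un-flatten, then move that axis back to the end.
    return np.moveaxis(b.reshape(rolled_shape), 2, -1)

a = np.random.randint(11, 99, (6, 6, 10, 20, 4))
restored = restore_dims(collapse_dims(a), a.shape)
print(np.array_equal(restored, a))  # True
```

Because the axes are permuted before flattening, a bare reshape back to the original shape is not enough here; the axis has to be moved back as well.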

Upvotes: 0

B. M.

Reputation: 18668

reshape accepts -1 to infer a dimension automatically:

import numpy as np

a = np.random.rand(20, 20, 8, 6, 4)
s = a.shape[:2]
if a.ndim > 2:
    s = s + (-1,)  # let NumPy infer the merged third axis
b = a.reshape(s)

Upvotes: 2
