Amit

Reputation: 6294

Flattening a numpy array with indexes

Given a NumPy array of shape (X, Y, 2) representing X "frames" of Y "points" each, where every point has an (x, y) coordinate, I'd like to merge the first and second dimensions into an (X*Y, 4) array in which each row holds the frame index, the point index, and the point's coordinates.

For example, if my array is:

[
  [          # Frame 0
   [1, 2],   # Point 0
   [2, 3]    # Point 1
  ],
  [          # Frame 1
   [4, 5],   # Point 0
   [6, 7]    # Point 1
  ]
]

I'd like to get the array:

[
  [0, 0, 1, 2],   # Frame 0, Point 0
  [0, 1, 2, 3],   # Frame 0, Point 1
  [1, 0, 4, 5],   # Frame 1, Point 0
  [1, 1, 6, 7]    # Frame 1, Point 1
]

Slow solution:

import numpy as np

arr = np.array([[[1, 2],[2, 3]],[[4, 5],[6, 7]]])
new_arr = []
# Prepend the frame index i and point index j to each point
for i, points in enumerate(arr):
  for j, point in enumerate(points):
    new_arr.append([i, j] + point.tolist())

Is there a faster way?

Upvotes: 1

Views: 746

Answers (3)

alani

Reputation: 13079

A larger example array is used in this code so that it could be tested with different sizes in each dimension:

import numpy as np

arr = np.array(
    [
        [          
            [1, 2],   
            [2, 3],   
            [3, 4]    
            ],
        [          
            [4, 5],   
            [6, 7],    
            [8, 7]    
            ],
        [          
            [14, 5],   
            [16, 7],    
            [18, 7]    
            ],
        [            
            [24, 5],   
            [26, 7],    
            [28, 7]    
            ]
        ]
)

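x, y = arr.shape[:2]
assert arr.shape[2] == 2
# meshgrid returns the point-index grid first, then the frame-index grid; both have shape (x, y)
ay, ax = (a.reshape(x, y, 1) for a in np.meshgrid(np.arange(y), np.arange(x)))
# Stack frame index, point index and coordinates along the last axis, then flatten
new_array = np.concatenate([ax, ay, arr], axis=2).reshape(x * y, 4)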

print(repr(new_array))

gives the following:

array([[ 0,  0,  1,  2],
       [ 0,  1,  2,  3],
       [ 0,  2,  3,  4],
       [ 1,  0,  4,  5],
       [ 1,  1,  6,  7],
       [ 1,  2,  8,  7],
       [ 2,  0, 14,  5],
       [ 2,  1, 16,  7],
       [ 2,  2, 18,  7],
       [ 3,  0, 24,  5],
       [ 3,  1, 26,  7],
       [ 3,  2, 28,  7]])

And using your original example array gives:

array([[0, 0, 1, 2],
       [0, 1, 2, 3],
       [1, 0, 4, 5],
       [1, 1, 6, 7]])

There are no explicit Python loops, so it ought to be faster. (Any looping happens inside NumPy, in optimised C code.)
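The same idea can also be written with np.indices instead of np.meshgrid. This is only a sketch of an alternative, not part of the code above; the function name flatten_with_indices is made up for illustration.

```python
import numpy as np

def flatten_with_indices(arr):
    """Hypothetical helper: prepend (frame, point) indices to each point."""
    x, y = arr.shape[:2]
    # np.indices((x, y)) has shape (2, x, y); move the index axis last -> (x, y, 2)
    idx = np.moveaxis(np.indices((x, y)), 0, -1)
    # Join indices and coordinates along the last axis, then merge the first two axes
    return np.concatenate([idx, arr], axis=2).reshape(x * y, -1)

arr = np.array([[[1, 2], [2, 3]], [[4, 5], [6, 7]]])
result = flatten_with_indices(arr)
print(result)
```

Using -1 in the final reshape means the last dimension does not have to be exactly 2, if that is ever useful.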

Upvotes: 1

V. Ayrat

Reputation: 2719

You can solve each part separately: use numpy.ndindex to get the indices and .reshape() to flatten the points. Then you can use numpy.c_ to stack them.

import numpy as np

a = np.array([[[1, 2],[2, 3]],[[4, 5],[6, 7]]])
c = a.reshape(-1, a.shape[-1])
print(c)
# [[1 2]
#  [2 3]
#  [4 5]
#  [6 7]]
indices = list(np.ndindex(a.shape[:-1]))
print(indices)
# [(0, 0), (0, 1), (1, 0), (1, 1)]
print(np.c_[indices, c])
# [[0 0 1 2]
#  [0 1 2 3]
#  [1 0 4 5]
#  [1 1 6 7]]
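Note that list(np.ndindex(...)) builds a Python list of tuples, which may become the slow part for large arrays. As a rough sketch (an alternative I am assuming fits here, not something measured), np.unravel_index can produce the same index columns as arrays:

```python
import numpy as np

a = np.array([[[1, 2], [2, 3]], [[4, 5], [6, 7]]])
lead_shape = a.shape[:-1]                 # (frames, points)
flat = a.reshape(-1, a.shape[-1])         # one row per point

# unravel_index maps flat positions 0..N-1 to one index array per leading axis
idx_cols = np.unravel_index(np.arange(flat.shape[0]), lead_shape)

# column_stack turns each 1-D index array into a column and appends the coordinates
stacked = np.column_stack([*idx_cols, flat])
print(stacked)
```

This works for any number of leading dimensions, since lead_shape is taken from the array itself.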

Upvotes: 2

theX

Reputation: 1134

I'm also new to NumPy but I think this should work (someone correct me if I'm wrong): arr.reshape(-1,4)
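For anyone checking this against the example array from the question: reshape only regroups the existing values, so it cannot add the frame and point indexes. A quick check:

```python
import numpy as np

arr = np.array([[[1, 2], [2, 3]], [[4, 5], [6, 7]]])
# reshape regroups the 8 existing values into rows of 4; no index columns are added
reshaped = arr.reshape(-1, 4)
print(reshaped)
# [[1 2 2 3]
#  [4 5 6 7]]
```

The result has shape (2, 4) rather than the (X*Y, 4) array of indexes and coordinates the question asks for.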

Upvotes: 0
