Sia Rezaei

Reputation: 437

Get indices and values of an ndarray in NumPy

I have an ndarray A with an arbitrary number of dimensions N. I want to create an array B of tuples (or arrays, or lists) where the first N elements of each tuple are the index and the last element is the value of A at that index.

For example:

A = array([[1, 2, 3], [4, 5, 6]])

Then

B = [(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]

What is the best/fastest way to do this in NumPy without for loops?

Upvotes: 3

Views: 1924

Answers (2)

Brad Solomon

Reputation: 40878

You can also do this with np.ndindex, although @MSeifert's approach is pretty unbeatable on timing and brevity. The only loop here is the one that zips the generator of coordinates with the actual values (the same as in the other answer).

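import numpy as np

def tuple_index(a):
    # np.ndindex yields index tuples in C order, matching a.flatten()
    indices = np.ndindex(*a.shape)
    return [(*i, j) for i, j in zip(indices, a.flatten())]

a = np.array([[1, 2, 3], [4, 5, 6]])
print(tuple_index(a))
[(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]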

Upvotes: 1

MSeifert

Reputation: 152577

If you have Python 3 (3.5 or later, for the unpacking syntax), a very simple (and moderately fast) way is to use np.ndenumerate:

>>> import numpy as np
>>> A = np.array([[1, 2, 3], [4, 5, 6]])
>>> [(*idx, val) for idx, val in np.ndenumerate(A)]
[(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]

It is a bit different if you want it to work on both Python 3 and Python 2, because Python 2 doesn't allow iterable unpacking inside a tuple literal. But you can use tuple concatenation (addition) instead:

>>> [idx + (val,) for idx, val in np.ndenumerate(A)]
[(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]
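
One detail worth noting: the values yielded by np.ndenumerate are NumPy scalars rather than plain Python numbers, and recent NumPy versions print them as np.int64(1) and so on. If plain Python values are wanted, a minimal sketch (the helper name index_value_tuples is hypothetical, not part of NumPy) could convert them with .item():

```python
import numpy as np

def index_value_tuples(arr):
    # idx is the index tuple; val is a NumPy scalar, so .item() turns it
    # into the matching plain Python object (int, float, ...)
    return [idx + (val.item(),) for idx, val in np.ndenumerate(arr)]

# a 3-D array, to show that nothing changes for more dimensions
A = np.arange(8).reshape(2, 2, 2)
result = index_value_tuples(A)
print(result[:3])
```

The tuple concatenation form is used here so that the sketch does not depend on the unpacking syntax.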

If you want to stay completely within NumPy, it would be better to create the indices with np.mgrid:

>>> grid = np.mgrid[:A.shape[0], :A.shape[1]]  # indices!
>>> np.stack([grid[0], grid[1], A]).reshape(3, -1).T
array([[0, 0, 1],
       [0, 1, 2],
       [0, 2, 3],
       [1, 0, 4],
       [1, 1, 5],
       [1, 2, 6]])

However, that would require a loop to convert it to a list of tuples. It is easy to convert it to a list of lists, though:

>>> np.stack([grid[0], grid[1], A]).reshape(3, -1).T.tolist()
[[0, 0, 1], [0, 1, 2], [0, 2, 3], [1, 0, 4], [1, 1, 5], [1, 2, 6]]

A list of tuples is also possible without a visible for-loop:

>>> list(map(tuple, np.stack([grid[0], grid[1], A]).reshape(3, -1).T.tolist()))
[(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]

Even though there is no visible for-loop, tolist, list, tuple and map still hide a loop that builds Python objects one element at a time.


For arrays with an arbitrary number of dimensions you need to change the latter approach a bit:

coords = tuple(map(slice, A.shape))
grid = np.mgrid[coords]

# array version
np.stack(list(grid) + [A]).reshape(A.ndim+1, -1).T
# list of list version
np.stack(list(grid) + [A]).reshape(A.ndim+1, -1).T.tolist()
# list of tuple version
list(map(tuple, np.stack(list(grid) + [A]).reshape(A.ndim+1, -1).T.tolist()))
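
As an alternative sketch (a swapped-in technique, not what the answer above uses), np.indices builds the same index grids directly from the shape. The helper name index_value_rows is hypothetical, and the sketch assumes the array has at least one dimension:

```python
import numpy as np

def index_value_rows(arr):
    """Return an (arr.size, arr.ndim + 1) array: index columns, then the value."""
    # np.indices(shape) has shape (ndim, *shape); flatten each index grid to one row
    idx = np.indices(arr.shape).reshape(arr.ndim, -1)
    # append the flattened values as the last row, then transpose
    # so that every row describes one element
    return np.vstack([idx, arr.reshape(1, -1)]).T

A = np.arange(24).reshape(2, 3, 4)
rows = index_value_rows(A)
print(rows.shape)
print(rows[:3])
```

Because indices and values share one array, they also share one dtype: with a float array the index columns come back as floats. That applies to the np.stack versions above as well, and is one reason to prefer the list-of-tuples form when the index must stay an integer.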

The ndenumerate approach works for arrays of any number of dimensions without change and, according to my timings, is only 2-3 times slower.
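
To check the relative speed on your own machine, something like the following could be used. It is only a rough sketch: the array shape and repeat count are arbitrary assumptions, the function names are hypothetical, and the ratio will vary with array size and NumPy version.

```python
import timeit
import numpy as np

def with_ndenumerate(arr):
    # pure-Python loop over (index, value) pairs
    return [idx + (val,) for idx, val in np.ndenumerate(arr)]

def with_mgrid(arr):
    # build the index grids in NumPy, then convert to a list of tuples
    grid = np.mgrid[tuple(map(slice, arr.shape))]
    stacked = np.stack(list(grid) + [arr]).reshape(arr.ndim + 1, -1).T
    return list(map(tuple, stacked.tolist()))

A = np.random.randint(0, 100, size=(50, 60, 7))

# both approaches should produce the same tuples
assert with_ndenumerate(A) == with_mgrid(A)

for fn in (with_ndenumerate, with_mgrid):
    seconds = timeit.timeit(lambda: fn(A), number=5)
    print(fn.__name__, round(seconds, 4))
```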

Upvotes: 3
