Get indices and values of an ndarray in NumPy

Question

I have an ndarray A with an arbitrary number of dimensions N. I want to create an array B of tuples (or arrays, or lists) where the first N elements of each tuple are the index and the last element is the value at that index in A.

For example:

A = array([[1, 2, 3], [4, 5, 6]])

Then

B = [(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]

What is the best/fastest way to do this in NumPy without for loops?

MSeifert · Accepted Answer

If you are on Python 3.5 or later, a very simple (and moderately fast) way is to use np.ndenumerate:

>>> import numpy as np
>>> A = np.array([[1, 2, 3], [4, 5, 6]])
>>> [(*idx, val) for idx, val in np.ndenumerate(A)]
[(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]

It would be a bit different if you want it to work for both Python 3 and Python 2, because Python 2 doesn't allow iterable unpacking inside a tuple literal. But you could use tuple concatenation (addition):

>>> [idx + (val,) for idx, val in np.ndenumerate(A)]
[(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]
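As a quick check that the tuple-concatenation form needs no changes for higher-dimensional input, here is a minimal sketch. The helper name index_value_tuples and the 3-D array C are illustrative and not part of the original answer. The values in the tuples are NumPy scalars, so their printed form may differ between NumPy versions.

```python
import numpy as np

def index_value_tuples(arr):
    """Return a list of (i_0, ..., i_{N-1}, value) tuples for an N-d array."""
    # np.ndenumerate yields (index_tuple, value) pairs in C (row-major) order,
    # so appending the value to the index tuple gives the requested layout.
    return [idx + (val,) for idx, val in np.ndenumerate(arr)]

A = np.array([[1, 2, 3], [4, 5, 6]])
B = index_value_tuples(A)
print(B)

# Same helper on a 3-D array: each tuple has 3 indices followed by the value.
C = np.arange(8).reshape(2, 2, 2)
print(index_value_tuples(C)[:3])
```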

If you want to stay entirely within NumPy, it is better to create the indices with np.mgrid:

>>> grid = np.mgrid[:A.shape[0], :A.shape[1]]  # indices!
>>> np.stack([grid[0], grid[1], A]).reshape(3, -1).T
array([[0, 0, 1],
       [0, 1, 2],
       [0, 2, 3],
       [1, 0, 4],
       [1, 1, 5],
       [1, 2, 6]])

However, that would require a loop to convert it to a list of tuples. Converting it to a list of lists is easy, though:

>>> np.stack([grid[0], grid[1], A]).reshape(3, -1).T.tolist()
[[0, 0, 1], [0, 1, 2], [0, 2, 3], [1, 0, 4], [1, 1, 5], [1, 2, 6]]

A list of tuples is also possible without a visible for-loop:

>>> list(map(tuple, np.stack([grid[0], grid[1], A]).reshape(3, -1).T.tolist()))
[(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]

Even though there is no visible for-loop, tolist, list, tuple and map all hide a loop in the Python layer.

For arrays with an arbitrary number of dimensions you need to change the latter approach a bit:

coords = tuple(map(slice, A.shape))
grid = np.mgrid[coords]

# array version
np.stack(list(grid) + [A]).reshape(A.ndim+1, -1).T
# list of lists version
np.stack(list(grid) + [A]).reshape(A.ndim+1, -1).T.tolist()
# list of tuples version
list(map(tuple, np.stack(list(grid) + [A]).reshape(A.ndim+1, -1).T.tolist()))
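The N-dimensional variant above can be wrapped in a small function and tried on 2-D and 3-D input. This is only a sketch: the function name index_value_rows and the test arrays are assumptions made for illustration.

```python
import numpy as np

def index_value_rows(arr):
    """Return an (arr.size, arr.ndim + 1) array: N index columns, then the value."""
    coords = tuple(map(slice, arr.shape))   # one slice(n) per axis
    grid = np.mgrid[coords]                 # shape: (ndim, *arr.shape)
    stacked = np.stack(list(grid) + [arr])  # shape: (ndim + 1, *arr.shape)
    # Flatten every plane, then transpose so each row is (indices..., value).
    return stacked.reshape(arr.ndim + 1, -1).T

A = np.array([[1, 2, 3], [4, 5, 6]])
rows = index_value_rows(A)
as_tuples = list(map(tuple, rows.tolist()))
print(as_tuples)

# 3-D input: 24 rows, each holding 3 indices and 1 value.
C = np.arange(24).reshape(2, 3, 4)
rows_3d = index_value_rows(C)
print(rows_3d.shape)
```

One thing to keep in mind with the array version: indices and values share a single dtype, so for a float array the index columns are stored as floats too. The ndenumerate approach does not have this effect, because each tuple keeps integer indices next to the original value.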

The ndenumerate approach works for arrays of any number of dimensions without change and, according to my timings, is only 2-3 times slower.
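To compare the two approaches on your own machine, a rough timing harness could look like the following. The array size, the repeat count and the function names are arbitrary choices for this sketch, and the measured ratio will depend on array shape, dtype, and the NumPy and Python versions in use.

```python
import timeit
import numpy as np

A = np.arange(2500).reshape(50, 50)

def with_ndenumerate(arr):
    # Pure-Python list comprehension over NumPy's index/value iterator.
    return [idx + (val,) for idx, val in np.ndenumerate(arr)]

def with_mgrid(arr):
    # Build the index grid in NumPy, then convert rows to tuples at the end.
    grid = np.mgrid[tuple(map(slice, arr.shape))]
    stacked = np.stack(list(grid) + [arr]).reshape(arr.ndim + 1, -1).T
    return list(map(tuple, stacked.tolist()))

# Both approaches should produce the same list of tuples.
assert with_ndenumerate(A) == with_mgrid(A)

t_enum = timeit.timeit(lambda: with_ndenumerate(A), number=20)
t_grid = timeit.timeit(lambda: with_mgrid(A), number=20)
print(f"ndenumerate: {t_enum:.4f} s, mgrid: {t_grid:.4f} s, ratio: {t_enum / t_grid:.1f}x")
```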
