rroowwllaanndd

Reputation: 3958

NumPy: How to collapse N-dimensional array along a single dimension using argmin/max output?

Is there a straightforward way to use the output of NumPy's argmax or argmin, computed along a single dimension of an N-D array, as an index into that array?

This is probably best explained with an example. Consider the following example of a 2D grid of readings of temperature across time:

>>> import numpy as np
>>> times = np.array([0, 20])
>>> temperature_map_t0 = np.array([[10, 12, 14], [23, 40, 50]])
>>> temperature_map_t1 = np.array([[20, 12, 15], [23, 10, 12]])
>>> temperature_map = np.dstack([temperature_map_t0, temperature_map_t1])

and an identically shaped N-D array containing the corresponding pressure readings:

>>> pressure_map = np.random.rand(*temperature_map.shape)

We can find the top temperatures at each location:

>>> top_temperatures = temperature_map.max(axis=2)
>>> top_temperatures
array([[20, 12, 15],
       [23, 40, 50]])

and the times at which they occurred:

>>> times = times[temperature_map.argmax(axis=2)]
>>> times
array([[20,  0, 20],
       [ 0,  0,  0]])

But how can we use temperature_map.argmax(axis=2) to find the corresponding pressures?

>>> pressures_at_top_temperatures = pressure_map[ ???? ]

In other words - what is the indexing syntax to collapse a single dimension of an N-D array using the argmin or argmax indices for that dimension?

Upvotes: 2

Views: 2644

Answers (2)

rroowwllaanndd

Reputation: 3958

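The most straightforward solution I could think of was to use logical indexing to zero out the entries that are not selected by the desired index, and then to sum over the dimension of interest, e.g. as follows:

def collapse_dimension(ndarr, index, axis):
    # Move the axis being collapsed to the front so r[i] is the i-th slice along it.
    r = np.rollaxis(ndarr, axis, 0)
    # Keep each slice only where the index selects it, then add the slices up.
    # A list is used because np.sum on a generator is deprecated.
    return np.sum([r[i] * (index == i) for i in range(r.shape[0])], axis=0)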

So given the above example, we can use argmax or argmin to collapse the array on any given dimension, e.g.

>>> pressures_at_top_temperatures = collapse_dimension(
...     pressure_map, temperature_map.argmax(axis=2), 2)

and, trivially, get the max across any given dimension using the corresponding argmax:

>>> temperature_map.max(axis=2) == collapse_dimension(
...     temperature_map, temperature_map.argmax(axis=2), 2)
array([[ True,  True,  True],
       [ True,  True,  True]], dtype=bool)
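
One caveat with multiplying by a mask: a NaN or inf in an entry that is not selected still ends up in the result, because nan * 0 is nan. Below is a rough sketch of a variant that selects with np.where instead. The name collapse_dimension_where and the small test array are made up for illustration, and this is only one possible way to sidestep the issue.

```python
import numpy as np

def collapse_dimension_where(ndarr, index, axis):
    # Hypothetical variant of collapse_dimension: pick values with np.where
    # instead of multiplying by a boolean mask, so unselected NaNs are ignored.
    r = np.moveaxis(ndarr, axis, 0)
    out = np.zeros(index.shape, dtype=ndarr.dtype)
    for i in range(r.shape[0]):
        out = np.where(index == i, r[i], out)
    return out

# Made-up data of shape (1, 2, 2) with a NaN in a position that is NOT selected.
data = np.array([[[1.0, np.nan], [3.0, 4.0]]])
index = np.array([[0, 1]])  # select 1.0 from the first pair, 4.0 from the second

collapsed = collapse_dimension_where(data, index, 2)

# The mask-and-sum form on the same input, for comparison.
r = np.moveaxis(data, 2, 0)
masked_sum = np.sum([r[i] * (index == i) for i in range(r.shape[0])], axis=0)
```

Here collapsed holds only the selected values, while masked_sum picks up the NaN from the unselected entry.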

However, I have a strong suspicion there's a nicer way to do this that doesn't involve writing this extra function -- any ideas??

Upvotes: 0

YXD

Reputation: 32521

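Credit to Jaime, who answered when I had a similar problem. The idea is to build an index array for every axis and let broadcasting pair them up with the argmax output: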

import numpy as np
times = np.array([0, 20])
temperature_map_t0 = np.array([[10, 12, 14], [23, 40, 50]])
temperature_map_t1 = np.array([[20, 12, 15], [23, 10, 12]])
temperature_map = np.dstack([temperature_map_t0, temperature_map_t1])
top_temperatures = temperature_map.max(axis=2)

# shape is a tuple - no need to convert
# http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.shape.html
pressure_map = np.random.rand(*temperature_map.shape)

idx = temperature_map.argmax(axis=2)

s = temperature_map.shape
result = pressure_map[np.arange(s[0])[:, None], np.arange(s[1]), idx]
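
For what it's worth, newer NumPy releases (1.15 and later, as far as I know) also have np.take_along_axis, which does the same lookup without spelling out an arange for every other axis. A minimal sketch on the question's data follows; the seeded generator is my own choice just to make it repeatable, and the last lines check it against the explicit indexing above.

```python
import numpy as np

temperature_map = np.dstack([
    np.array([[10, 12, 14], [23, 40, 50]]),
    np.array([[20, 12, 15], [23, 10, 12]]),
])
rng = np.random.default_rng(0)  # seed is arbitrary, only for repeatability
pressure_map = rng.random(temperature_map.shape)

axis = 2
idx = temperature_map.argmax(axis=axis)

# take_along_axis wants an index array with the same number of dimensions
# as the data, so restore the collapsed axis with length 1, then drop it again.
picked = np.take_along_axis(pressure_map, np.expand_dims(idx, axis=axis), axis=axis)
pressures_at_top_temperatures = np.squeeze(picked, axis=axis)

# Cross-check against the explicit broadcasting form.
s = temperature_map.shape
explicit = pressure_map[np.arange(s[0])[:, None], np.arange(s[1]), idx]
assert np.array_equal(pressures_at_top_temperatures, explicit)
```

Because only axis appears in the call, the same three lines should work for arrays of any dimensionality, which the explicit form does not do without extra aranges.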

Upvotes: 2
