Eric

Reputation: 97601

Default value when indexing outside of a numpy array, even with non-trivial indexing

Is it possible to look up entries from an nd array without throwing an IndexError?

I'm hoping for something like:

>>> a = np.arange(10) * 2
>>> a[[-4, 2, 8, 12]]
IndexError
>>> wrap(a, default=-1)[[-4, 2, 8, 12]]
[-1, 4, 16, -1]

>>> wrap(a, default=-1)[200]
-1

Or possibly more like get_with_default(a, [-4, 2, 8, 12], default=-1)

Is there some builtin way to do this? Can I ask numpy not to throw the exception and return garbage, which I can then replace with my default value?

Upvotes: 12

Views: 3080

Answers (3)

jgoodman

Reputation: 19

This is my first post on any Stack Exchange site, so forgive me for any stylistic errors (hopefully there are only stylistic errors). I am interested in the same feature but could not find anything in numpy better than np.take, mentioned by hpaulj. Still, np.take doesn't do exactly what's needed. Alfe's answer works but would need some elaboration to handle n-dimensional inputs.

The following is another workaround that generalizes to the n-dimensional case. The basic idea is similar to the one used by Alfe: create a new index with the out-of-bounds indices masked out (in my case) or disguised (in Alfe's case), and use it to index the input array without raising an error. Note that negative indices are treated as out of bounds here, as in the question.

import numpy as np

def take(a, indices, default=0):
    # accept lists as well as arrays: one index array per axis of a
    indices = [np.asarray(ind) for ind in indices]
    # initialize mask; will broadcast to the length of indices[0] in the first iteration
    mask = True
    for i, ind in enumerate(indices):
        # each element of the mask is True only if all indices at that position are in bounds
        mask = mask & (0 <= ind) & (ind < a.shape[i])
    # keep only the in-bounds indices
    in_bound = [ind[mask] for ind in indices]
    # initialize result with the default value
    result = np.full(len(mask), default, dtype=a.dtype)
    # set elements indexed by in_bound to their appropriate values in a
    result[mask] = a[tuple(in_bound)]
    return result

And here is the output from Eric's sample problem:

>>> a = np.arange(10)*2
>>> indices = (np.array([-4,2,8,12]),)
>>> take(a,indices,default=-1)
array([-1,  4, 16, -1])
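
To illustrate the n-dimensional case, here is a minimal sketch of the same masking idea applied to a 2-D array. The name take_with_default and the sample grid are assumptions made up for this example, and the use of np.broadcast_arrays (so that the result follows the shape of the index arrays) is an addition that is not part of the function above.

```python
import numpy as np

def take_with_default(a, indices, default=0):
    """Look up a[indices], using `default` wherever any index is out of bounds.

    Hypothetical helper for illustration: `indices` holds one index array
    per axis of `a`. Negative indices count as out of bounds.
    """
    # Broadcast the per-axis index arrays to one common shape.
    indices = np.broadcast_arrays(*[np.asarray(ind) for ind in indices])
    # A position is valid only if its index is in bounds on every axis.
    mask = np.ones(indices[0].shape, dtype=bool)
    for axis, ind in enumerate(indices):
        mask &= (0 <= ind) & (ind < a.shape[axis])
    # Start from the default and fill in the valid positions.
    result = np.full(mask.shape, default, dtype=a.dtype)
    result[mask] = a[tuple(ind[mask] for ind in indices)]
    return result

grid = np.arange(12).reshape(3, 4)   # made-up sample data
rows = np.array([0, 2, 3, -1])       # 3 and -1 are out of bounds for axis 0
cols = np.array([1, 3, 0, 2])
out = take_with_default(grid, (rows, cols), default=-1)
print(out)  # [ 1 11 -1 -1]
```

Because the result is created with a.dtype, the default should be representable in that dtype.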

Upvotes: 1

Alfe

Reputation: 59476

You can restrict the range of the indexes to the size of the array you are indexing into using np.maximum() and np.minimum().

Example:

I have a heatmap like

h = np.array([[ 2,  3,  1],
              [ 3, -1,  5]])

and I have a palette of RGB values I want to use to color the heatmap. The palette only names colors for the values 0..4:

p = np.array([[0, 0, 0],  # black
              [0, 0, 1],  # blue
              [1, 0, 1],  # purple
              [1, 1, 0],  # yellow
              [1, 1, 1]]) # white

Now I want to color my heatmap using the palette:

p[h]

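Currently this leads to an error because of the value 5 in the heatmap (the -1 does not raise, but it silently wraps around to the last palette entry, which is not what I want either):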

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index 5 is out of bounds for axis 0 with size 5

But I can limit the range of the heatmap:

p[np.maximum(np.minimum(h, 4), 0)]

This works and gives me the result:

array([[[1, 0, 1],
        [1, 1, 0],
        [0, 0, 1]],

       [[1, 1, 0],
        [0, 0, 0],
        [1, 1, 1]]])

If you really need a special value for the indexes that are out of bounds, you could implement your proposed get_with_default() for a one-dimensional array like this:

def get_with_default(values, indexes, default=-1):
    return np.concatenate([[default], values, [default]])[
        np.maximum(np.minimum(indexes, len(values)), -1) + 1]

a = np.arange(10) * 2
get_with_default(a, [-4, 2, 8, 12], default=-1)

Will return:

array([-1,  4, 16, -1])

as wanted.
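
The same result can also be obtained without concatenating, by clipping the indexes for the lookup and then overwriting the invalid positions with np.where. This is a different technique from the padding trick above, shown only as a sketch; the name get_with_default_where is made up for this example, and it assumes a non-empty one-dimensional values array.

```python
import numpy as np

def get_with_default_where(values, indexes, default=-1):
    # Hypothetical helper; assumes `values` is 1-D and non-empty.
    values = np.asarray(values)
    indexes = np.asarray(indexes)
    # True where the index points inside `values` (negatives count as invalid).
    valid = (indexes >= 0) & (indexes < len(values))
    # Clip so the lookup itself never raises; invalid slots are replaced below.
    safe = np.clip(indexes, 0, len(values) - 1)
    return np.where(valid, values[safe], default)

a = np.arange(10) * 2
result = get_with_default_where(a, [-4, 2, 8, 12], default=-1)
print(result)  # [-1  4 16 -1]
```

Unlike the concatenating version, this does not copy the whole values array on every call, at the cost of building a boolean mask the size of the index array.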

Upvotes: 0

hpaulj

Reputation: 231425

np.take with mode='clip' sort of does this:

In [155]: a
Out[155]: array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [156]: a.take([-4,2,8,12],mode='raise')
...
IndexError: index 12 is out of bounds for size 10

In [157]: a.take([-4,2,8,12],mode='wrap')
Out[157]: array([12,  4, 16,  4])

In [158]: a.take([-4,2,8,12],mode='clip')
Out[158]: array([ 0,  4, 16, 18])

Except that you don't have much control over the return value: here indexing with 12 returns 18, the last value, and -4 is treated as out of bounds in the other direction, returning 0.

One way of adding the defaults is to pad a first:

In [174]: a = np.arange(10) * 2
In [175]: ind=np.array([-4,2,8,12])

In [176]: np.pad(a, [1,1], 'constant', constant_values=-1).take(ind+1, mode='clip')
Out[176]: array([-1,  4, 16, -1])

Not exactly pretty, but a start.
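
For the wrap(a, default=-1)[...] syntax from the question, the pad-and-take approach could be packaged roughly as follows. This is only a sketch under stated assumptions: the class name wrap comes from the question and is not a numpy feature, it handles one-dimensional arrays only, and it supports integer and integer-array indexes but not slices.

```python
import numpy as np

class wrap:
    """Hypothetical 1-D wrapper; not part of numpy."""

    def __init__(self, a, default):
        # One default element on each side, so clipped indexes land on it.
        self._padded = np.pad(np.asarray(a), 1, mode='constant',
                              constant_values=default)

    def __getitem__(self, ind):
        # Shift by one for the left pad; mode='clip' sends every
        # out-of-range index (including negatives) to a pad slot.
        return self._padded.take(np.asarray(ind) + 1, mode='clip')

a = np.arange(10) * 2
w = wrap(a, default=-1)
many = w[[-4, 2, 8, 12]]
one = w[200]
print(many)  # [-1  4 16 -1]
print(one)   # -1
```

The padded copy is built once in the constructor, so repeated lookups do not pad again; later changes to a are not reflected in the wrapper.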

Upvotes: 9
