Samufi

Reputation: 2710

Change n-th entry of NumPy array that fulfills condition

I have a NumPy array arr and an (inverse) mask mask. For simplicity let us assume they are both 1d. I want to change the nth non-masked value in arr.

An example:

import numpy as np
arr = np.arange(5)
mask = np.array((True, False, True, True, False))

Unfortunately,

arr[mask][-1] = 100

which I expected to change arr to

array([0, 1, 2, 100, 4])

does not work due to the reasons outlined in NumPy array views on non-consecutive items.
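A minimal sketch of why the chained assignment has no effect, assuming the same arr and mask as above: boolean indexing builds a new array, so the write lands in a temporary copy. The names selected and shares are introduced here for illustration only.

```python
import numpy as np

arr = np.arange(5)
mask = np.array((True, False, True, True, False))

# Boolean (mask) indexing returns a copy, not a view of arr.
selected = arr[mask]
shares = np.shares_memory(arr, selected)

# Writing into the copy leaves the original array untouched.
selected[-1] = 100
```

Here shares is False, and arr is unchanged after the assignment.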

A workaround would be to store the allowed values in a new variable, change the respective value, and copy all values back into the original array:

tmp = arr[mask]
tmp[-1] = 100
arr[mask] = tmp

However, this solution is ugly and inefficient, since I have to copy many values that I do not want to change at all.

Does anyone have an elegant way to deal with this kind of problem? I would be interested in a maximally general solution, so that I could do all classic assignment operations with tmp. However, if there is an efficient way that works only for the concrete case described here, I would still be interested in it!

Upvotes: 2

Views: 282

Answers (2)

Anton Protopopov

Reputation: 31692

You could also use np.nonzero to get the index from your mask:

index = mask.nonzero()[0][-1]

arr[index] = 100

In [29]: arr 
Out[29]: array([  0,   1,   2, 100,   4])
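The same idea extends from the last selected element to the n-th one. The following is only a sketch for the 1-D case; the helper name set_nth_selected is hypothetical, and np.flatnonzero is used as a shorthand for mask.nonzero()[0].

```python
import numpy as np

def set_nth_selected(arr, mask, n, value):
    """Assign value to the n-th element of 1-D arr where mask is True.

    n follows normal Python indexing, so negative values count from the end.
    """
    idx = np.flatnonzero(mask)  # positions where mask is True
    arr[idx[n]] = value         # a single write into the original array
    return arr

arr = np.arange(5)
mask = np.array((True, False, True, True, False))
set_nth_selected(arr, mask, 1, 100)
```

With this mask the selected positions are 0, 2 and 3, so n=1 writes to arr[2].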

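Alternatively, you could convert your array to a list and use the list's index method to find the index of the last selected value. Note that index returns the first occurrence, so this only works if that value is unique in arr: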

index = arr.tolist().index(arr[mask][-1])
arr[index] = 100

In [78]: arr
Out[78]: array([  0,   1,   2, 100,   4])
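A caveat with the list-based variant: list.index returns the first matching position, so it can point at the wrong element when values repeat. A small illustration with a made-up array that contains a duplicate (the names wrong_index and right_index are only for this example):

```python
import numpy as np

arr = np.array([3, 1, 2, 3, 4])
mask = np.array((True, False, True, True, False))

# The last selected value is 3, but 3 also occurs at position 0.
wrong_index = arr.tolist().index(arr[mask][-1])

# The nonzero-based lookup works on positions, not values.
right_index = mask.nonzero()[0][-1]
```

Here wrong_index is 0 while right_index is 3, so only the position-based lookup targets the intended element.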

Benchmarking:

In [87]: %timeit arr[mask.nonzero()[0][-1]] = 100
1000000 loops, best of 3: 897 ns per loop

In [88]: %timeit arr[np.where(mask)[0][-1]] = 100
1000000 loops, best of 3: 980 ns per loop

In [91]: %timeit arr[arr.tolist().index(arr[mask][-1])] = 100
100000 loops, best of 3: 2.44 us per loop

So the nonzero method is a bit faster than np.where.

EDIT

I think nonzero is a bit faster because, according to the docs for np.where:

If only condition is given, return condition.nonzero().

So basically you're calling np.nonzero, but through np.where, because in that case you're passing only a condition.
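A quick check of that equivalence, assuming the mask from the question (the variable names are illustrative):

```python
import numpy as np

mask = np.array((True, False, True, True, False))

where_idx = np.where(mask)    # tuple of index arrays, one per dimension
nonzero_idx = mask.nonzero()  # same structure

# Compare the two tuples element by element.
same = len(where_idx) == len(nonzero_idx) and all(
    np.array_equal(a, b) for a, b in zip(where_idx, nonzero_idx)
)
```

Both calls give the same index arrays, so any timing difference comes from call overhead rather than from a different result.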

Upvotes: 0

ali_m

Reputation: 74262

One option would be to use np.where to obtain the set of indices where your mask condition is True. You can then index into arr using a subset of these indices and make your assignment:

# np.where returns a tuple of index arrays, one per dimension
arr[np.where(mask)[0][-1]] = 100

print(repr(arr))
# array([  0,   1,   2, 100,   4])

You could combine this approach with slice indexing, boolean indexing etc. For example:

arr[np.where(mask)[0][::-1]] = 100, 200, 300
print(repr(arr))
# array([300,   1, 200, 100,   4])
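Because the index array refers to positions in arr itself, in-place operators work as well, and the idea carries over to more dimensions. The following is a sketch under the assumption that the selected positions are unique (which holds for indices derived from a mask); arr2d and mask2d are made-up examples.

```python
import numpy as np

arr = np.arange(5)
mask = np.array((True, False, True, True, False))

idx = np.flatnonzero(mask)  # selected positions: 0, 2, 3
arr[idx[-2:]] += 10         # in-place add on the last two selected elements
arr[idx[::2]] = -1          # assign to every second selected element

# N-d case: np.nonzero returns one index array per dimension.
arr2d = np.arange(6).reshape(2, 3)
mask2d = arr2d % 2 == 0             # True at (0, 0), (0, 2), (1, 1)
rows, cols = np.nonzero(mask2d)
arr2d[rows[-1], cols[-1]] = 100     # change the last selected element
```

Only the targeted elements are written, so nothing else in arr has to be copied back.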

Upvotes: 2
