Reputation: 1107
I come to a problem like this:
suppose I have arrays like this:
a = np.array([[1,2,3,4,5,4,3,2,1],])
label = np.array([[1,0,1,0,0,1,1,0,1],])
I need to obtain the indices of a
at which position the element value of label
is 1 and the value of a
is the largest amount all that causing label
to be 1.
It maybe confusing, in the above example, the indices where label
is 1 are: 0, 2, 5, 6, 8, their corresponding values of a
are thus: 1, 3, 4, 3, 1, among which 4 is the larges, thus I need to get the result of 5 which is the index of number 4 in a
. How could I do this with numpy ?
Upvotes: 3
Views: 655
Reputation: 221684
Get the 1s
indices say as idx
, then index into a
with it, get max
index and finally trace it back to the original order by indexing into idx
-
idx = np.flatnonzero(label==1)
out = idx[a[idx].argmax()]
Sample run -
# Assuming inputs to be 1D
In [18]: a
Out[18]: array([1, 2, 3, 4, 5, 4, 3, 2, 1])
In [19]: label
Out[19]: array([1, 0, 1, 0, 0, 1, 1, 0, 1])
In [20]: idx = np.flatnonzero(label==1)
In [21]: idx[a[idx].argmax()]
Out[21]: 5
For a
as ints and label
as an array of 0s
and 1s
, we could optimize further as we could scale a
based on the range of values in it, like so -
(label*(a.max()-a.min()+1) + a).argmax()
Furthermore, if a
has positive numbers only, it would simplify to -
(label*(a.max()+1) + a).argmax()
Timings for positive ints largish a
-
In [115]: np.random.seed(0)
...: a = np.random.randint(0,10,(100000))
...: label = np.random.randint(0,2,(100000))
In [117]: %%timeit
...: idx = np.flatnonzero(label==1)
...: out = idx[a[idx].argmax()]
1000 loops, best of 3: 592 µs per loop
In [116]: %timeit (label*(a.max()-a.min()+1) + a).argmax()
1000 loops, best of 3: 357 µs per loop
# @coldspeed's soln
In [120]: %timeit np.ma.masked_where(~label.astype(bool), a).argmax()
1000 loops, best of 3: 1.63 ms per loop
# won't work with negative numbers in a
In [119]: %timeit (label*(a.max()+1) + a).argmax()
1000 loops, best of 3: 292 µs per loop
# @klim's soln (won't work with negative numbers in a)
In [121]: %timeit np.argmax(a * (label == 1))
1000 loops, best of 3: 229 µs per loop
Upvotes: 3
Reputation: 1269
Here is one of the simplest ways.
>>> np.argmax(a * (label == 1))
5
>>> np.argmax(a * (label == 1), axis=1)
array([5])
Coldspeed's method may take more time.
Upvotes: 1
Reputation: 403020
You can use masked arrays:
>>> np.ma.masked_where(~label.astype(bool), a).argmax()
5
Upvotes: 1