muammar

Reputation: 967

Find indices of N occurrences in arrays using numpy

I have an array obtained from sp.distance.cdist; it looks as follows:

[[ 0.          5.37060126  2.68530063  4.65107712  2.68530063  4.65107712
   2.04846297  7.41906423  4.11190697  6.50622284  4.11190697  6.50622284]
 [ 5.37060126  0.          4.65107712  2.68530063  4.65107712  2.68530063
   7.41906423  2.04846297  6.50622284  4.11190697  6.50622284  4.11190697]
 [ 2.68530063  4.65107712  0.          2.68530063  4.65107712  5.37060126
   4.11190697  6.50622284  2.04846297  4.11190697  6.50622284  7.41906423]
 [ 4.65107712  2.68530063  2.68530063  0.          5.37060126  4.65107712
   6.50622284  4.11190697  4.11190697  2.04846297  7.41906423  6.50622284]
 [ 2.68530063  4.65107712  4.65107712  5.37060126  0.          2.68530063
   4.11190697  6.50622284  6.50622284  7.41906423  2.04846297  4.11190697]
 [ 4.65107712  2.68530063  5.37060126  4.65107712  2.68530063  0.
   6.50622284  4.11190697  7.41906423  6.50622284  4.11190697  2.04846297]
 [ 2.04846297  7.41906423  4.11190697  6.50622284  4.11190697  6.50622284
   0.          9.4675272   4.7337636   8.19911907  4.7337636   8.19911907]
 [ 7.41906423  2.04846297  6.50622284  4.11190697  6.50622284  4.11190697
   9.4675272   0.          8.19911907  4.7337636   8.19911907  4.7337636 ]
 [ 4.11190697  6.50622284  2.04846297  4.11190697  6.50622284  7.41906423
   4.7337636   8.19911907  0.          4.7337636   8.19911907  9.4675272 ]
 [ 6.50622284  4.11190697  4.11190697  2.04846297  7.41906423  6.50622284
   8.19911907  4.7337636   4.7337636   0.          9.4675272   8.19911907]
 [ 4.11190697  6.50622284  6.50622284  7.41906423  2.04846297  4.11190697
   4.7337636   8.19911907  8.19911907  9.4675272   0.          4.7337636 ]
 [ 6.50622284  4.11190697  7.41906423  6.50622284  4.11190697  2.04846297
   8.19911907  4.7337636   9.4675272   8.19911907  4.7337636   0.        ]]

What I'm trying to do, using numpy, is to search for values in a range, for example between 2.3 and 2.7, and at the same time return the indices where they are found in each row of the array. I have read a lot and found, for example, .argmin(), which does part of what I want, but it only shows where the zeros (or values lower than zero) are located, and only one occurrence. In the documentation of .argmin I cannot find how to get the minimum other than zero, or how to keep it from stopping after the first occurrence. I need to do this for all the values in the interval. To explain myself better, this is what I expect to get:

e.g.:

[row (0), index (2), index (4)]
[row (1), index (3), index (5)]
[row (2), index (0), index (3)]

What would be the best way to do this? In the meantime, I'll keep trying and if I find a solution I'll post it here.

Thanks.

Upvotes: 3

Views: 225

Answers (2)

Acorbe

Reputation: 8391

What you are looking for is the np.argwhere function, which returns the indices at which a condition on an array is satisfied.

import numpy as np

v = np.array([[0.0,        5.37060126, 2.68530063, 4.65107712, 2.5],
              [5.37060126, 4.65107712, 2.68530063, 0.11190697, 1.0]])

np.argwhere((v > 2.3) & (v < 2.7))

array([[0, 2],
       [0, 4],
       [1, 2]])
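
If the per-row layout from the question is wanted, the (row, column) pairs can be collected into a dictionary. This is only a sketch on the small example array above; the name `rows` is made up for illustration.

```python
import numpy as np

v = np.array([[0.0,        5.37060126, 2.68530063, 4.65107712, 2.5],
              [5.37060126, 4.65107712, 2.68530063, 0.11190697, 1.0]])

# Each row of `pairs` is one (row, column) position where the condition holds.
pairs = np.argwhere((v > 2.3) & (v < 2.7))

# Collect the column indices under their row index (hypothetical helper dict).
rows = {}
for r, c in pairs:
    rows.setdefault(int(r), []).append(int(c))

for r, cols in rows.items():
    print(r, cols)
```

Rows with no value in the interval simply do not appear in the dictionary.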

Upvotes: 2

Saullo G. P. Castro

Reputation: 58885

What you need is numpy.where, which, given a condition, returns a tuple with one index array per dimension marking where the condition is True for the values of a numpy.ndarray. Example using your data:

# a is the 12x12 array from the question
i, j = np.where((a > 2.3) & (a < 2.7))
# i: array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5])
# j: array([2, 4, 3, 5, 0, 3, 1, 2, 0, 5, 1, 4])

Then you can use groupby to put the output in the format that you want:

from itertools import groupby

# np.where returns the indices sorted by row, so pairs from the same row are adjacent
for k, g in groupby(zip(i, j), key=lambda x: x[0]):
    print(int(k), [int(col) for _, col in g])
#0 [2, 4]
#1 [3, 5]
#2 [0, 3]
#3 [1, 2]
#4 [0, 5]
#5 [1, 4]
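
The question also asks about finding the minimum other than zero without stopping at the first occurrence. One possible approach, sketched here on a small made-up array (not the data from the question), is to hide the zeros before taking the row minimum and then compare with a tolerance. The names `masked`, `row_min`, `hits` and `result` are only illustrative.

```python
import numpy as np

# Small illustrative distance-like matrix, invented for this sketch.
a = np.array([[0.0, 5.0, 2.5, 4.0, 2.5],
              [5.0, 0.0, 4.0, 2.5, 4.0],
              [2.5, 4.0, 0.0, 3.0, 3.0]])

# Replace zeros with +inf so they cannot be picked as the minimum.
masked = np.where(a > 0, a, np.inf)
row_min = masked.min(axis=1)

# Compare with a tolerance, since computed distances are rarely bit-identical.
hits = np.isclose(a, row_min[:, None])

# Column indices of every occurrence of the nonzero minimum, one list per row.
result = [np.flatnonzero(row).tolist() for row in hits]

for r, cols in enumerate(result):
    print(r, cols)
```

The default tolerances of np.isclose may need adjusting depending on the scale of the distances.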

Upvotes: 1
