Florida Man

Reputation: 2147

Find indices in an array that contain one of values from another array

How can I get the indices of the values in an array (a) that match any of several "markers" given in another array (label)? For example, given

label = array([1, 2])
a = array([1, 1, 2, 2, 3, 3])

the goal is to find the indices of a with the value of 1 or 2; that is, 0, 1, 2, 3.

I tried several combinations. None of the following seems to work.

label = array([1, 2])
a = array([1, 1, 2, 2, 3, 3])
idx = where(a==label)  # gives me only the index of the last value in label
idx = where(a==label[0] or label[1])  # Is confused by all or any?
idx = where(a==label[0] | label[1])   # gives me results as if nor. idx = [4,5] 
idx = where(a==label[0] || label[1])  # syntax error
idx = where(a==bolean.or(label,0,1)   # I know, this is not the correct form but I don`t remember it correctly but remember the error: also asks for a.all or a.any
idx = where(label[0] or label[1] in a)         # gives me only the first appearance. index = 0. Also without where().
idx = where(a==label[0] or a==label[1]).all()) # syntax error
idx = where(a.any(0,label[0] or label[1]))  # gives me only the first appearance. index=0. Also without where().
idx = where(a.any(0,label[0] | label[1]))  # gives me only the first appearance. index=0. Also without where().
idx=where(a.any(0,label))  # Datatype not understood

Ok, I think you get my problem. Does anyone know how to do it correctly? Best would be a solution with a general label instead of label[x] so that the use of label is more variable for later changes.

Upvotes: 0

Views: 1477

Answers (4)

it's-yer-boy-chet

Reputation: 2007

I think what I'm reading as your intent is to get the indices in the second list, 'a', of the values in the first list, 'labels'. I think that a dictionary is a good way to store this information where the labels will be keys and indices will be the values.

Try this:

    labels = [1,2]
    a = [1,1,2,2,3,3]
    results = {}
    for label in labels:
        results[label] = [i for i,x in enumerate(a) if x == label]

If you want the indices of 1, just look up results[1]. The list comprehension and the enumerate function are the real MVPs here.
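If a single combined list of indices is wanted (0, 1, 2, 3 in the question's example), the per-label lists can be merged afterwards. This is only a sketch built on the same dictionary idea; the name all_indices is made up for illustration.

```python
labels = [1, 2]
a = [1, 1, 2, 2, 3, 3]

# Map each label to the positions in a where it occurs.
results = {label: [i for i, x in enumerate(a) if x == label] for label in labels}

# Flatten the per-label lists into one sorted list of indices.
all_indices = sorted(i for indices in results.values() for i in indices)
print(all_indices)
```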

Upvotes: 0

hpaulj

Reputation: 231325

np.where(a==label) is the same as np.nonzero(a==label). It tells us the coordinates (indexes) of all non-zero (or True) elements in the array a==label.

So instead of trying all these different where expressions, focus on the conditional array.

Without the where, here's what some of your expressions produce:

In [40]: a==label  # 2 arrays don't match in size, scalar False   
Out[40]: False  

In [41]: a==label[0]   # result is the size of a
Out[41]: array([ True,  True, False, False, False, False], dtype=bool)

In [42]: a==label[0] or label[1]  # or is a Python scalar operation
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [43]: a==label[0] | label[1]
Out[43]: array([False, False, False, False,  True,  True], dtype=bool)

This last is the same as a==(label[0] | label[1]), the | is evaluated before the ==.
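To make the precedence point concrete, here is a small sketch using the question's arrays. The variable names combined, mask_unparenthesized and mask_explicit are only for illustration.

```python
import numpy as np

label = np.array([1, 2])
a = np.array([1, 1, 2, 2, 3, 3])

# The bitwise OR of the two scalars is evaluated first: 1 | 2 is 3.
combined = label[0] | label[1]

# So the unparenthesized expression compares a against 3, not against 1 or 2.
mask_unparenthesized = a == label[0] | label[1]
mask_explicit = a == (label[0] | label[1])

print(combined)
print(np.array_equal(mask_unparenthesized, mask_explicit))
```

That is why the result marks only the positions holding 3.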

You need to understand how each of those arrays (or scalars, or errors) is produced before you can understand what where gives you.

Correct combination of 2 equality tests (the extra () are important):

In [44]: (a==label[1]) | (a==label[0])
Out[44]: array([ True,  True,  True,  True, False, False], dtype=bool)

Using broadcasting to separately test the 2 elements of label. Result is 2d array:

In [45]: a==label[:,None]
Out[45]: 
array([[ True,  True, False, False, False, False],
       [False, False,  True,  True, False, False]], dtype=bool)

In [47]: (a==label[:,None]).any(axis=0)
Out[47]: array([ True,  True,  True,  True, False, False], dtype=bool)
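Once the boolean mask is right, getting the indices is one more step. A minimal sketch, assuming a and label are the 1d arrays from the question; np.flatnonzero returns a flat index array, while np.where (with one argument) returns a tuple with one array per dimension.

```python
import numpy as np

label = np.array([1, 2])
a = np.array([1, 1, 2, 2, 3, 3])

# Broadcast comparison: one row per label value, then collapse the rows.
mask = (a == label[:, None]).any(axis=0)

# Indices of the True entries as a plain 1d array.
idx = np.flatnonzero(mask)

# np.where gives the same indices, wrapped in a 1-element tuple for a 1d mask.
idx_tuple = np.where(mask)

print(idx)
print(idx_tuple[0])
```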

Upvotes: 1

Alok--

Reputation: 724

You can use numpy.in1d:

>>> a = numpy.array([1, 1, 2, 2, 3, 3])
>>> label = numpy.array([1, 2])
>>> numpy.in1d(a, label)
array([ True,  True,  True,  True, False, False], dtype=bool)

The above returns a mask. If you want indices, you can call numpy.nonzero on the mask array.

Also, if the values in label array are unique, you can pass assume_unique=True to in1d to possibly speed it up.
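For completeness, a sketch of going from the mask to the indices. It uses numpy.isin, which newer NumPy versions provide as the replacement for in1d (as far as I know in1d is deprecated as of NumPy 2.0); on an older install, swap in numpy.in1d.

```python
import numpy as np

a = np.array([1, 1, 2, 2, 3, 3])
label = np.array([1, 2])

# Boolean mask: True wherever an element of a appears in label.
mask = np.isin(a, label)

# Convert the mask to the positions of its True entries.
idx = np.flatnonzero(mask)
print(idx)
```

Because label is passed whole, it can grow or change later without touching this code.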

Upvotes: 4

Ran Cui

Reputation: 71

As I understand it, you want the indices of 1 and 2 in array "a".

In that case, try

label= [1,2] 
a= [1,1,2,2,3,3]

idx_list = list()
for x in label:
    for i in range(len(a)):  # visit every index, including the last one
        if a[i] == x:
            idx_list.append(i)
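The same idea can be written as a single pass over a, which also keeps the indices in ascending order regardless of the order of label. This is just an alternative sketch; the name wanted is made up here.

```python
label = [1, 2]
a = [1, 1, 2, 2, 3, 3]

# A set makes the membership test cheap for each element of a.
wanted = set(label)

# Walk a once and keep the positions whose value is one of the labels.
idx_list = [i for i, x in enumerate(a) if x in wanted]
print(idx_list)
```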

Upvotes: 0
