ahoosh

Reputation: 1360

Comparing a numpy array object to multiple conditions

I am trying to use numpy.where to find the indices I want. Here's the code:

import numpy as np
a = np.array([20,58,32,0,107,57]).reshape(2,3)
item_index = np.where((a == 58) | (a == 107) | (a == 20))
print(item_index)

I get item_index as below:

(array([0, 0, 1]), array([0, 1, 1]))

However, in reality a has shape 20000 x 7, and there are several hundred conditions instead of just three. Is there a way to use numpy.where for that many conditions? I found the topics here, here and here useful, but I couldn't find the answer to my question.

Upvotes: 2

Views: 1446

Answers (3)

dawg

Reputation: 103834

Given (per your example):

>>> a
array([[ 20,  58,  32],
       [  0, 107,  57]])

For the query 'is an element of a in a list of values?', just use numpy.in1d:

>>> np.in1d(a, [58, 107, 20])
array([ True,  True, False, False,  True, False], dtype=bool)

If you want the result to have the same shape as the underlying array, so that the same indices apply to both, reshape it to the shape of a:

>>> np.in1d(a, [58, 107, 20]).reshape(a.shape)
array([[ True,  True, False],
       [False,  True, False]], dtype=bool)

Then test against that:

>>> tests = np.in1d(a, [58, 107, 20]).reshape(a.shape)
>>> tests[1, 1]                # is the element a[1, 1] in the list [58, 107, 20]?
True

In one line (obvious, but I do not know how efficient it is for one-off queries):

>>> np.in1d(a, [58, 107, 20]).reshape(a.shape)[1,1]
True
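
To get from the boolean mask back to the index tuple the question asks for, the mask can be passed to np.where. The following is a minimal sketch, assuming a NumPy version that provides np.isin (1.13 or later); np.isin keeps the shape of a, so no reshape is needed, and newer NumPy releases recommend it over np.in1d. The names targets and mask are illustrative and not from the question.

```python
import numpy as np

a = np.array([20, 58, 32, 0, 107, 57]).reshape(2, 3)
targets = [58, 107, 20]  # illustrative name; could hold several hundred values

# np.isin returns a boolean mask with the same shape as a
mask = np.isin(a, targets)

# np.where on a boolean mask returns one index array per dimension
item_index = np.where(mask)
print(item_index)
```

For the small example this should give the same row and column indices as the chained == comparisons in the question.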

Upvotes: 3

wwii

Reputation: 23753

Add another dimension to each so they can be broadcast against each other:

>>> a = np.array([20,58,32,0,107,57]).reshape(2,3)
>>> b = np.array([58, 107, 20])
>>> np.any(a[...,np.newaxis] == b[np.newaxis, ...], axis=2)
array([[ True,  True, False],
       [False,  True, False]], dtype=bool)
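
One thing to keep in mind with broadcasting is that the comparison builds an intermediate boolean array with one extra axis whose length is the number of conditions. The sketch below shows that shape for the small example and estimates the element count for the dimensions in the question; the figure of 300 conditions is an assumption standing in for "several hundred", and the variable names are illustrative.

```python
import numpy as np

a = np.array([20, 58, 32, 0, 107, 57]).reshape(2, 3)
b = np.array([58, 107, 20])

# a[..., np.newaxis] has shape (2, 3, 1); b is broadcast along the last axis
compared = a[..., np.newaxis] == b
print(compared.shape)  # (2, 3, 3)

# collapse the extra axis: True where any condition matched
mask = compared.any(axis=-1)
item_index = np.where(mask)

# element count of the intermediate for a 20000 x 7 array,
# assuming 300 conditions (a stand-in for "several hundred")
rows, cols, n_conditions = 20000, 7, 300
n_elements = rows * cols * n_conditions
print(n_elements)
```

Under that assumption the intermediate holds about 42 million one-byte booleans, which is usually manageable but grows linearly with the number of conditions.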

Upvotes: 2

chrisb

Reputation: 52246

Someone better at numpy may have a better solution - but if you have pandas installed you could do something like this.

import pandas as pd
df = pd.DataFrame(a)  # create a pandas DataFrame from the array a in the question

conditions = [58, 107, 20]
# boolean mask -> underlying numpy array -> indices of the True entries
item_index = df.isin(conditions).values.nonzero()

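isin builds a boolean array which is True if the value is in the conditions list. The call to .values extracts the underlying numpy array from the pandas DataFrame. The call to nonzero() returns the indices of the True entries as a tuple of arrays, one per dimension, which is the same form numpy.where gives for a single condition.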
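
As a small illustration of that last step, here is a sketch of nonzero() on a hand-written boolean mask matching the example array. The mask is typed in directly rather than computed, and the names rows, cols and pairs are illustrative.

```python
import numpy as np

# mask for the example: True where a is one of 58, 107, 20
mask = np.array([[True, True, False],
                 [False, True, False]])

# nonzero() returns one index array per dimension
rows, cols = mask.nonzero()

# pair up the row and column indices of the True entries
pairs = list(zip(rows.tolist(), cols.tolist()))
print(pairs)  # [(0, 0), (0, 1), (1, 1)]
```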

Upvotes: 2
