Yly

Reputation: 2310

Checking whether NumPy array is in Python list

I have a list of NumPy arrays, and I would like to check whether a given array is in the list. The in operator behaves very strangely here, and I'm wondering how to get around it. Here's a simple version of the problem:

import numpy as np
x = np.array([1,1])
a = [x,1]

x in a        # Returns True
(x+1) in a    # Throws ValueError
1 in a        # Throws ValueError

I don't understand what is going on here. Is there a good workaround to this problem?

I'm working with Python 3.7.

Edit: The exact error is:

ValueError: The truth value of an array with more than one element is ambiguous.  Use a.any() or a.all()

My numpy version is 1.18.1.

Upvotes: 3

Views: 312

Answers (4)

norok2

Reputation: 26886

(EDIT: to include a more general and perhaps cleaner approach)

One way around it is to implement a NumPy-safe version of in:

import numpy as np


def in_np(x, items):
    for item in items:
        if isinstance(x, np.ndarray) and isinstance(item, np.ndarray) \
                and x.shape == item.shape and np.all(x == item):
            return True
        elif isinstance(x, np.ndarray) or isinstance(item, np.ndarray):
            pass
        elif x == item:
            return True
    return False


x = np.array([1, 1])
a = [x, 1]

for y in (x, 0, 1, x + 1, np.array([1, 1, 1])):
    print(in_np(y, a))
# True
# False
# True
# False
# False

Or, even better, write a version of in that accepts an arbitrary comparison function (defaulting to ==, as the built-in in does), and pass it np.array_equal(), which always returns a single bool and so behaves the way one expects == to behave here. In code:

import operator

import numpy as np


def in_(x, items, eq=operator.eq):
    for item in items:
        if eq(x, item):
            return True
    return False


x = np.array([1, 1])
a = [x, 1]

for y in (x, 0, 1, x + 1, np.array([1, 1, 1])):
    print(in_(y, a, np.array_equal))
# True
# False
# True
# False
# False
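The same idea can be condensed with any() and a generator expression. This is only a sketch: in_list is a hypothetical name, not part of NumPy or the standard library, and it hard-codes np.array_equal as the comparison.

```python
import numpy as np


def in_list(y, items):
    """Hypothetical helper: True if any item equals y according to np.array_equal."""
    # np.array_equal always returns a single bool, so any() never sees an array.
    return any(np.array_equal(y, item) for item in items)


x = np.array([1, 1])
a = [x, 1]

for y in (x, 0, 1, x + 1, np.array([1, 1, 1])):
    print(in_list(y, a))
```

One design consequence to keep in mind: np.array_equal converts its arguments to arrays first, so a plain list such as [1, 1] would also be reported as present in a.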

Finally, note that items can be any iterable. For hashing containers such as set() or dict the result is still correct, but the lookup is a linear O(n) scan instead of the O(1) of the built-in in:

print(in_(1, {1, 2, 3}))
# True
print(in_(0, {1, 2, 3}))
# False

in_(1, {1: 2, 3: 4})
# True
in_(0, {1: 2, 3: 4})
# False

Upvotes: 1

Serge Ballesta

Reputation: 148870

The reason is that in is more or less interpreted as

def in_sequence(elt, seq):
    for i in seq:
        if elt == i:
            return True
    return False

And 1 == x does not give False: NumPy evaluates it element-wise and returns an array of booleans, and converting that array to a single truth value raises an exception. Element-wise comparison makes sense in most contexts, but here it gives surprising behaviour.
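A slightly closer model also explains why x in a succeeds in the question: for lists, Python tests identity before equality, so the very same array object is found without == ever being called. The sketch below assumes that rule; contains is a hypothetical stand-in for the built-in in, not real CPython code.

```python
import numpy as np


def contains(elt, seq):
    """Rough pure-Python model of `elt in seq` for a list (illustrative only)."""
    for item in seq:
        # Identity is checked first; == only runs when the objects differ.
        if item is elt or item == elt:
            return True
    return False


x = np.array([1, 1])
a = [x, 1]

# Same object as a[0], so the identity check short-circuits.
print(contains(x, a))

try:
    contains(x + 1, a)
except ValueError as exc:
    # x == (x + 1) is a two-element boolean array, which has no single truth value.
    print("ValueError:", exc)
```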

It sounds like a bug, but it is not easy to fix. Processing 1 == np.array([1, 1]) the same way as np.array([1, 1]) == np.array([1, 1]) is a major feature of NumPy. And delegating equality comparisons to classes is a major feature of Python. So I cannot even imagine what the correct behaviour should be.

TL;DR: Never mix Python lists and NumPy arrays, because they have very different semantics and the mix leads to inconsistent corner cases.

Upvotes: 1

cicolus

Reputation: 827

When evaluating x in [1, x], Python compares x with each element of the list in turn. The first comparison, x == 1, produces a NumPy array rather than a single bool:

>>> x == 1
array([ True,  True])

and interpreting this array as a bool value will trigger the error due to inherent ambiguity:

>>> bool(x == 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
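As the error message suggests, reducing the element-wise result with .all() or .any() removes the ambiguity. A minimal sketch, reusing the x from the question (the variable name mask is just for illustration):

```python
import numpy as np

x = np.array([1, 1])

mask = (x == 1)        # element-wise comparison: one boolean per element

# Reduce the boolean array to a single truth value, choosing the meaning explicitly.
print(mask.all())      # every element equals 1?
print(mask.any())      # at least one element equals 1?
print((x == 2).any())  # no element equals 2
```

For a membership test, .all() combined with a shape check is usually the intended meaning, since two arrays count as equal only when every element matches.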

Upvotes: 0

William Clavier

Reputation: 334

You can do it like so:

import numpy as np
x = np.array([1,1])
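# dtype=object is required on NumPy >= 1.24, where ragged input otherwise raises ValueError
a = np.array([x.tolist(), 1], dtype=object)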

x in a # True
(x+1) in a # False
1 in a # True

Upvotes: 0
