Reputation: 2310
I have a list of numpy arrays, and I would like to check whether a given array is in the list. There is some very strange behavior with this, and I'm wondering how to get around it. Here's a simple version of the problem:
import numpy as np
x = np.array([1,1])
a = [x,1]
x in a # Returns True
(x+1) in a # Throws ValueError
1 in a # Throws ValueError
I don't understand what is going on here. Is there a good workaround to this problem?
I'm working with Python 3.7.
Edit: The exact error is:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
My numpy version is 1.18.1.
Upvotes: 3
Views: 312
Reputation: 26886
(EDIT: to include a more general and perhaps cleaner approach)
One way around it is to implement a NumPy-safe version of the in operator:
import numpy as np

def in_np(x, items):
    for item in items:
        if isinstance(x, np.ndarray) and isinstance(item, np.ndarray) \
                and x.shape == item.shape and np.all(x == item):
            return True
        elif isinstance(x, np.ndarray) or isinstance(item, np.ndarray):
            pass
        elif x == item:
            return True
    return False

x = np.array([1, 1])
a = [x, 1]
for y in (x, 0, 1, x + 1, np.array([1, 1, 1])):
    print(in_np(y, a))
# True
# False
# True
# False
# False
Or, even better, write a version of in that accepts an arbitrary comparison (defaulting to the usual == behavior), and then pass it np.array_equal(), whose semantics match what one would expect from == in this situation. In code:
import operator

def in_(x, items, eq=operator.eq):
    for item in items:
        if eq(x, item):
            return True
    return False

x = np.array([1, 1])
a = [x, 1]
for y in (x, 0, 1, x + 1, np.array([1, 1, 1])):
    print(in_(y, a, np.array_equal))
# True
# False
# True
# False
# False
Finally, note that items can be any iterable, but the lookup is always a linear scan: for hashing containers like set() it will not have the O(1) average cost of the built-in in, although it would still give correct results:
print(in_(1, {1, 2, 3}))
# True
print(in_(0, {1, 2, 3}))
# False
in_(1, {1: 2, 3: 4})
# True
in_(0, {1: 2, 3: 4})
# False
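For what it's worth, the same idea also fits in one line with the built-in any(). This is only a sketch; the helper name contains_array is made up for illustration and is not part of NumPy:

```python
import numpy as np

def contains_array(x, items):
    # Hypothetical helper: True if some item matches x in the np.array_equal
    # sense (same shape and same values). any() stops at the first match.
    return any(np.array_equal(x, item) for item in items)

x = np.array([1, 1])
a = [x, 1]
print(contains_array(x, a))                 # True
print(contains_array(np.array([1, 1]), a))  # True (equal values, different object)
print(contains_array(x + 1, a))             # False
print(contains_array(1, a))                 # True
```

Like in_() above, this is still a linear scan over items.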
Upvotes: 1
Reputation: 148870
The reason is that in, applied to a list, is more or less interpreted as
def in_sequence(elt, seq):
    for i in seq:
        # identity is tried first, then ==
        if i is elt or i == elt:
            return True
    return False
And 1 == x does not give False: numpy evaluates the comparison element-wise and returns an array of booleans, and an exception is raised when that array has to be reduced to a single truth value. It does make sense in most contexts, but here it gives surprising behaviour.
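A small aside that may help with the x in a case from the question: list membership tries an identity check (is) before falling back to ==, so the very same array object is found without any comparison, while an equal but distinct array still hits the error. A minimal sketch, with variable names chosen just for illustration:

```python
import numpy as np

x = np.array([1, 1])
a = [x, 1]

# The same object is found through the identity shortcut; == is never evaluated.
print(x in a)  # True

# An equal-valued but distinct array must go through ==, which yields an array;
# reducing that array to a single bool raises ValueError.
y = np.array([1, 1])
try:
    y in a
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```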
It sounds like a bug, but it is not easy to fix. Processing 1 == np.array([1, 1]) the same way as np.array([1, 1]) == np.array([1, 1]) is a major feature of numpy. And delegating equality comparisons to classes is a major feature of Python. So I cannot even imagine what the correct behaviour should be.
TL;DR: Never mix Python lists and numpy arrays because they have very different semantics and the mix leads to inconsistent corner cases.
Upvotes: 1
Reputation: 827
When using x in [1,x], Python will compare x with each of the elements in the list, and during the comparison x == 1, the result will be a numpy array:
>>> x == 1
array([ True, True])
and interpreting this array as a bool value will trigger the error due to inherent ambiguity:
>>> bool(x == 1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
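As the error message hints, the ambiguity goes away once the element-wise result is reduced explicitly with any() or all(). A minimal sketch (the array values here are made up for illustration):

```python
import numpy as np

x = np.array([1, 2])
mask = x == 1        # element-wise comparison, one boolean per element

print(mask.any())    # True: at least one element equals 1
print(mask.all())    # False: not every element equals 1
```

Which reduction is the right one depends on what "equal" is supposed to mean for the check at hand.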
Upvotes: 0
Reputation: 334
You can do it like so:
import numpy as np
x = np.array([1,1])
a = np.array([x.tolist(), 1], dtype=object) # dtype=object is needed for ragged input on NumPy >= 1.24
x in a # True
(x+1) in a # False
1 in a # True
Upvotes: 0