Reputation: 2257
I want to compare two numpy arrays element by element, taking position into account. For example:
[1, 2, 3]==[1, 2, 3] -> True
[1, 2, 3]==[2, 1, 3] -> False
I tried the following:
for index in range(list1.shape[0]):
    if list1[index] != list2[index]:
        return False
return True
But I got the following error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
However, the following is not the correct usage of .any or .all either:
numpy.any(numpy.array([1,2,3]), numpy.array([1,2,3]))
numpy.all(numpy.array([1,2,3]), numpy.array([1,2,3]))
It raises:
TypeError: only length-1 arrays can be converted to Python scalars
I am very confused. Can someone explain what I am doing wrong?
Thanks
Upvotes: 2
Views: 876
Reputation: 375435
You can also use np.array_equal:
In [11]: a = np.array([1, 2, 3])
In [12]: b = np.array([2, 1, 3])
In [13]: np.array_equal(a, a)
Out[13]: True
In [14]: np.array_equal(a, b)
Out[14]: False
This ought to be more efficient since you don't need to keep the temporary a==b...
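As a minimal sketch of the cases from the question, plus one extra case (arrays of different length) that is not in the original post, np.array_equal returns a plain True or False in each case. The variable names here are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])
same = np.array([1, 2, 3])
swapped = np.array([2, 1, 3])
shorter = np.array([1, 2])  # extra case, not from the question

# Same values in the same positions
same_result = np.array_equal(a, same)
# Same values, different positions
swapped_result = np.array_equal(a, swapped)
# Different shapes are never equal
shape_result = np.array_equal(a, shorter)

print(same_result, swapped_result, shape_result)
```

Because the shape is compared before any element, a length mismatch needs no special handling by the caller.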
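Note: a little about performance. For larger arrays you want to be using np.all rather than the built-in all. array_equal performs about the same, except when it can fail early, where it is much faster. In the timings below, c is one element shorter than a, so array_equal can return False from the shape check alone: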
In [21]: a = np.arange(100000)
In [22]: b = np.arange(100000)
In [23]: c = np.arange(1, 100000)
In [24]: %timeit np.array_equal(a, a) # Note: I expected this to check `is` first; it doesn't
10000 loops, best of 3: 183 µs per loop
In [25]: %timeit np.array_equal(a, b)
10000 loops, best of 3: 189 µs per loop
In [26]: %timeit np.array_equal(a, c)
100000 loops, best of 3: 5.9 µs per loop
In [27]: %timeit np.all(a == b)
10000 loops, best of 3: 184 µs per loop
In [28]: %timeit np.all(a == c)
10000 loops, best of 3: 40.7 µs per loop
In [29]: %timeit all(a == b)
100 loops, best of 3: 3.69 ms per loop
In [30]: %timeit all(a == c) # ahem!
# TypeError: 'bool' object is not iterable
Upvotes: 3
Reputation: 121966
You can pass an array of booleans to np.all, for example:
>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 1, 3])
>>> a == b
array([False, False, True], dtype=bool)
>>> np.all(a==b) # also works with all for 1D arrays
False
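This also explains the TypeError in the question: np.all and np.any reduce a single array, and their second positional argument is axis, not a second array to compare against. A minimal sketch of the intended pattern, comparing first and reducing second (variable names are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 1, 3])

# Step 1: elementwise comparison gives a boolean array
elementwise = a == b

# Step 2: reduce that one boolean array to a single truth value
all_equal = bool(np.all(elementwise))  # True only if every position matches
any_equal = bool(np.any(elementwise))  # True if at least one position matches

# The method form does the same reduction
method_form = bool((a == b).all())

print(elementwise, all_equal, any_equal, method_form)
```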
Note that the built-in all is much faster than np.all for small arrays (and np.array_equal is slower still):
>>> import timeit
>>> timeit.timeit("all(a==b)", setup="import numpy as np; a = np.array([1, 2, 3]); b = np.array([2, 1, 3])")
0.8798369040014222
>>> timeit.timeit("np.all(a==b)", setup="import numpy as np; a = np.array([1, 2, 3]); b = np.array([2, 1, 3])")
9.980971871998918
>>> timeit.timeit("np.array_equal(a, b)", setup="import numpy as np; a = np.array([1, 2, 3]); b = np.array([2, 1, 3])")
13.838635700998566
However, the built-in all will not work correctly with multidimensional arrays:
>>> a = np.arange(9).reshape(3, 3)
>>> b = a.copy()
>>> b[0, 0] = 42
>>> all(a==b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
>>> np.all(a==b)
False
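This is the same ValueError as in the question, which suggests (an assumption, since the question does not show the data) that list1 and list2 there had more than one dimension, so list1[index] != list2[index] produced an array rather than a single boolean. A sketch of the question's loop with that fixed, using a hypothetical helper name rows_equal:

```python
import numpy as np

def rows_equal(x, y):
    """Hypothetical helper: compare two arrays slice by slice along axis 0."""
    if x.shape != y.shape:
        return False
    for index in range(x.shape[0]):
        # x[index] != y[index] may itself be an array, so reduce it with np.any
        if np.any(x[index] != y[index]):
            return False
    return True

a = np.arange(9).reshape(3, 3)
b = a.copy()
b[0, 0] = 42

print(rows_equal(a, a.copy()))  # identical contents
print(rows_equal(a, b))         # differ at position [0, 0]
```

In practice np.all(a == b) or np.array_equal(a, b) does the same job without the Python-level loop; the helper is only meant to show where the original loop goes wrong.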
For larger arrays, np.all is fastest:
>>> timeit.timeit("np.all(a==b)", setup="import numpy as np; a = np.arange(1000); b = a.copy(); b[999] = 0")
13.581198551000853
>>> timeit.timeit("all(a==b)", setup="import numpy as np; a = np.arange(1000); b = a.copy(); b[999] = 0")
30.610838356002205
>>> timeit.timeit("np.array_equal(a, b)", setup="import numpy as np; a = np.arange(1000); b = a.copy(); b[999] = 0")
17.95089965599982
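The absolute numbers above depend on the machine and the NumPy version, so it may be worth re-running the comparison locally. A rough sketch of a harness for that, where time_comparisons is a hypothetical helper and the array sizes and repeat count are arbitrary choices rather than the ones used above:

```python
import timeit
import numpy as np

def time_comparisons(size, number=1000):
    """Hypothetical helper: time three ways of comparing two 1D arrays
    that differ only in their last element."""
    a = np.arange(size)
    b = a.copy()
    b[-1] = -1  # make the arrays differ at the last position
    candidates = {
        "all(a == b)": lambda: all(a == b),
        "np.all(a == b)": lambda: np.all(a == b),
        "np.array_equal(a, b)": lambda: np.array_equal(a, b),
    }
    # timeit.timeit accepts a callable and returns total seconds for `number` runs
    return {label: timeit.timeit(fn, number=number)
            for label, fn in candidates.items()}

if __name__ == "__main__":
    for size in (3, 1000):
        for label, seconds in time_comparisons(size).items():
            print(f"size={size:>5}  {label:<22} {seconds:.4f} s")
```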
Upvotes: 2