Reputation: 21264
In numpy, is there a nice idiomatic way of testing if all rows are equal in a 2d array?
I can do something like
np.all([np.array_equal(M[0], M[i]) for i in xrange(1,len(M))])
This seems to mix python lists with numpy arrays which is ugly and presumably also slow.
Is there a nicer/neater way?
Upvotes: 27
Views: 19219
Reputation: 53
Regarding the NaN issue in Alex's answer: np.isclose and np.allclose now accept an equal_nan argument:
np.isclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)
np.allclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)
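To make the effect of the flag concrete, here is a small sketch comparing the default behaviour with equal_nan=True. The variable names are only illustrative.

```python
import numpy as np

a = [1.0, np.nan]
b = [1.0, np.nan]

# Default behaviour: NaN never compares close to NaN.
default_mask = np.isclose(a, b)
# With equal_nan=True, NaNs in the same position count as equal.
nan_aware_mask = np.isclose(a, b, equal_nan=True)

print(default_mask.tolist())              # [True, False]
print(nan_aware_mask.tolist())            # [True, True]
print(np.allclose(a, b, equal_nan=True))  # True
```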
Upvotes: 1
Reputation: 176750
One way is to check that every row of the array arr is equal to its first row arr[0]:
(arr == arr[0]).all()
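As a quick illustration of the comparison above, on two small made-up arrays (same and different are hypothetical names):

```python
import numpy as np

same = np.array([[1, 2, 3], [1, 2, 3], [1, 2, 3]])
different = np.array([[1, 2, 3], [1, 2, 3], [1, 9, 3]])

# arr[0] has shape (3,) and is broadcast against every row of arr.
print((same == same[0]).all())            # True
print((different == different[0]).all())  # False
```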
Using equality == is fine for integer values, but if arr contains floating point values you could use np.isclose instead to check for equality within a given tolerance:
np.isclose(arr, arr[0]).all()
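A minimal sketch of why the tolerance matters, using the classic 0.1 + 0.2 rounding example and the default tolerances of np.isclose:

```python
import numpy as np

# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
arr = np.array([[0.3, 1.0], [0.1 + 0.2, 1.0]])

exact = (arr == arr[0]).all()
tolerant = np.isclose(arr, arr[0]).all()
print(exact, tolerant)  # False True
```

If the defaults are too loose or too strict for your data, np.isclose also takes rtol and atol arguments.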
If your array contains NaN and you want to avoid the tricky NaN != NaN issue, you could combine this approach with np.isnan (note that this treats any NaN as a match, regardless of the value in the first row):
(np.isclose(arr, arr[0]) | np.isnan(arr)).all()
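One thing to watch with the np.isnan combination is that a NaN passes even when the first row holds an ordinary number in that column. The sketch below shows that case, together with a stricter variant (my own suggestion, not part of the original answer) that only accepts a NaN where the first row is also NaN.

```python
import numpy as np

# Second row has NaN where the first row has 2.0, so the rows differ.
arr = np.array([[1.0, 2.0], [1.0, np.nan]])

loose = (np.isclose(arr, arr[0]) | np.isnan(arr)).all()
# Stricter variant: a NaN only matches a NaN in the same column of row 0.
strict = (np.isclose(arr, arr[0]) | (np.isnan(arr) & np.isnan(arr[0]))).all()
print(loose, strict)  # True False
```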
Upvotes: 33
Reputation: 3893
It is worth mentioning that the above version will not work for multidimensional arrays.
For example, for a three-dimensional image tensor img of shape [256, 256, 3], we may want to check whether the three RGB layers (each of shape [256, 256]) are identical.
In this case, we need to use broadcasting:
(img == img[:, :, 0, np.newaxis]).all()
This is because plain img[:, :, 0] has shape [256, 256], but we need [256, 256, 1] to broadcast across the layers.
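Here is a small sketch of the shapes involved, using a 4x4 stand-in for the 256x256 image. The array contents are invented for illustration.

```python
import numpy as np

# Small stand-in for a [256, 256, 3] image: every layer is the same.
layer = np.arange(16).reshape(4, 4)
img = np.stack([layer, layer, layer], axis=-1)   # shape (4, 4, 3)

first = img[:, :, 0, np.newaxis]                 # shape (4, 4, 1)
print(first.shape)
print((img == first).all())                      # True

# Slicing with 0:1 keeps the last axis and gives the same shape.
print(img[:, :, 0:1].shape == first.shape)       # True

img[0, 0, 2] += 1                                # make one layer differ
print((img == img[:, :, 0, np.newaxis]).all())   # False
```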
Upvotes: 6
Reputation: 250901
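Simply check whether the number of unique items in the array is 1 (note that this tests whether all elements are equal, which is stricter than all rows being equal):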
>>> arr = np.array([[1]*10 for _ in xrange(5)])
>>> len(np.unique(arr)) == 1
True
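Keep in mind that np.unique without an axis flattens the array, so it counts distinct elements rather than distinct rows. If rows may contain different values, one option (assuming a NumPy version where np.unique supports the axis argument, 1.13 or later) is to count unique rows instead:

```python
import numpy as np

# All rows are equal, but the array holds two distinct values.
arr = np.array([[1, 2], [1, 2], [1, 2]])

print(len(np.unique(arr)) == 1)          # False: unique elements are 1 and 2
print(len(np.unique(arr, axis=0)) == 1)  # True: only one distinct row
```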
A solution inspired by unutbu's answer:
>>> arr = np.array([[1]*10 for _ in xrange(5)])
>>> np.all(np.all(arr == arr[0,:], axis = 1))
True
One problem with your code is that it builds an entire list before applying np.all() to it, so no short-circuiting happens. It would be better to use Python's all() with a generator expression (the last command in each timing run below).
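A minimal sketch of the generator-based version, written for Python 3 (range instead of xrange). The helper name all_rows_equal is hypothetical and only used here for illustration.

```python
import numpy as np

def all_rows_equal(M):
    """Return True if every row of M equals the first row."""
    first = M[0]
    # The generator lets all() stop at the first mismatching row.
    return all(np.array_equal(first, row) for row in M[1:])

M = np.array([[3] * 100] + [[2] * 100 for _ in range(1000)])
print(all_rows_equal(M))  # False, decided after comparing one row
```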
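Timing comparisons. In the first run the first row differs from all the others, so the generator version can stop after one comparison; in the second run all rows are equal, so every version has to check every row: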
>>> M = arr = np.array([[3]*100] + [[2]*100 for _ in xrange(1000)])
>>> %timeit np.all(np.all(arr == arr[0,:], axis = 1))
1000 loops, best of 3: 272 µs per loop
>>> %timeit (np.diff(M, axis=0) == 0).all()
1000 loops, best of 3: 596 µs per loop
>>> %timeit np.all([np.array_equal(M[0], M[i]) for i in xrange(1,len(M))])
100 loops, best of 3: 10.6 ms per loop
>>> %timeit all(np.array_equal(M[0], M[i]) for i in xrange(1,len(M)))
100000 loops, best of 3: 11.3 µs per loop
>>> M = arr = np.array([[2]*100 for _ in xrange(1000)])
>>> %timeit np.all(np.all(arr == arr[0,:], axis = 1))
1000 loops, best of 3: 330 µs per loop
>>> %timeit (np.diff(M, axis=0) == 0).all()
1000 loops, best of 3: 594 µs per loop
>>> %timeit np.all([np.array_equal(M[0], M[i]) for i in xrange(1,len(M))])
100 loops, best of 3: 9.51 ms per loop
>>> %timeit all(np.array_equal(M[0], M[i]) for i in xrange(1,len(M)))
100 loops, best of 3: 9.44 ms per loop
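The sessions above were recorded with IPython's %timeit on Python 2. To repeat the comparison on Python 3 without IPython, a rough sketch using the standard timeit module might look like this. The candidates dictionary and its labels are my own, and the numbers will depend on your machine and NumPy version.

```python
import timeit
import numpy as np

M = np.array([[2] * 100 for _ in range(1000)])

# Each candidate is a zero-argument callable so timeit can run it directly.
candidates = {
    "broadcast ==": lambda: (M == M[0]).all(),
    "np.diff": lambda: (np.diff(M, axis=0) == 0).all(),
    "generator all()": lambda: all(
        np.array_equal(M[0], M[i]) for i in range(1, len(M))
    ),
}

runs = 20
for name, func in candidates.items():
    seconds = timeit.timeit(func, number=runs)
    print(f"{name}: {seconds / runs * 1e6:.1f} µs per call")
```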
Upvotes: 5