luca
luca

Reputation: 7546

Check if a numpy array is sorted

I have a numpy array and I like to check if it is sorted.

>>> a = np.array([1,2,3,4,5])
array([1, 2, 3, 4, 5])

Upvotes: 38

Views: 32144

Answers (4)

B. M.
B. M.

Reputation: 18668

With NumPy tools:

np.all(np.diff(a) >= 0)

but numpy solutions are all O(n).

If you want quick code and very quick conclusion on unsorted arrays:

import numba
@numba.jit
def is_sorted(a):
    for i in range(a.size-1):
         if a[i+1] < a[i] :
               return False
    return True
          

which is O(1) (in mean) on random arrays.

Upvotes: 45

Dorian B.
Dorian B.

Reputation: 1299

For completeness, the O(log n) iterative solution is found below. The recursive version is slower and crashes with big vector sizes. However, it is still slower than the native numpy using np.all(a[:-1] <= a[1:]) most likely due to modern CPU optimizations. The only case where the O(log n) is faster is on the "average" random case or if it is "almost" sorted. If you suspect your array is already fully sorted then np.all will be faster.

def is_sorted(a):
    idx = [(0, a.size - 1)]
    while idx:
        i, j = idx.pop(0) # Breadth-First will find almost-sorted in O(log N)
        if i >= j:
            continue
        elif a[i] > a[j]:
            return False
        elif i + 1 == j:
            continue
        else:
            mid = (i + j) >> 1 # Division by 2 with floor
            idx.append((i, mid))
            idx.append((mid, j))
    return True

is_sorted2 = lambda a: np.all(a[:-1] <= a[1:])

Here are the results:

# Already sorted array - np.all is super fast
sorted_array = np.sort(np.random.rand(1000000))

%timeit is_sorted(sorted_array)
659 ms ± 3.28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit is_sorted2(sorted_array)
431 µs ± 35.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# Here I included the random in each command so we need to substract it's timing
%timeit np.random.rand(1000000)
6.08 ms ± 17.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit is_sorted(np.random.rand(1000000))
6.11 ms ± 58.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# Without random part, it took 6.11 ms - 6.08 ms = 30µs per loop

%timeit is_sorted2(np.random.rand(1000000))
6.83 ms ± 75.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# Without random part, it took 6.83 ms - 6.08 ms = 750µs per loop

Net, a O(n) vector optimized code is better than an O(log n) algorithm, unless you will run >100 million element arrays.

Upvotes: 2

luca
luca

Reputation: 7546

np.all(a[:-1] <= a[1:])

Examples:

is_sorted = lambda a: np.all(a[:-1] <= a[1:])

>>> a = np.array([1,2,3,4,9])
>>> is_sorted(a)
True

>>> a = np.array([1,2,3,4,3])
>>> is_sorted(a)
False

Upvotes: 62

1&#39;&#39;
1&#39;&#39;

Reputation: 27115

The inefficient but easy-to-type solution:

(a == np.sort(a)).all()

Upvotes: 7

Related Questions