Fastest way to sort in Python (no cython)

Question

I have to sort a very large array (shape 7900000x4x4) with a custom comparison function. I used sorted, but it took more than an hour. My code was something like this:

def compare(x,y):
    print('DD '+str(x[0]))
    if(np.array_equal(x[1],y[1])==True):
        return -1
    a = x[1].flatten()
    b = y[1].flatten()
    idx = np.where( (a>b) != (a=0:
        return 0
    elif b[idx]<0 and a[idx]>=0:
        return 1
    elif a[idx]<0 and b[idx]<0:
        if a[idx]>b[idx]:
            return 0
        elif a[idx]



This worked, but I want it to complete in seconds. I don't think any direct implementation in Python can give me the performance I need, so I tried Cython. My Cython code is pretty simple:

cdef int[:,:] arrr
cdef int size

cdef bool compare(int a,int b):
    global arrr,size
    cdef int[:] x = arrr[a]
    cdef int[:] y = arrr[b]
    cdef int i,j
    i = 0
    j = 0
    while(i


This Cython code took 33 seconds! Cython is the solution, but I am looking for alternative solutions that can run directly in Python, for example Numba. I tried Numba, but I didn't get satisfying results. Kindly help!

max9111 · Accepted Answer

It is hard to give an answer without a working example. I assume that arrr in your Cython code was a 2D array and that size was size=arrr.shape[0].

Numba Implementation

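import numpy as np
import numba as nb
try:
    # Numba 0.49 and later ship the module under numba.misc
    from numba.misc import quicksort
except ImportError:
    # older Numba releases
    from numba.targets import quicksort


def custom_sorting(compare_fkt):
  # size is a module-level global, assumed to be arrr.shape[0] as stated above
  index_arange=np.arange(size)

  # is_argsort=False: the index values themselves are sorted, and
  # compare_fkt(a, b) receives two of those indices
  quicksort_func=quicksort.make_jit_quicksort(lt=compare_fkt,is_argsort=False)
  jit_sort_func=nb.njit(quicksort_func.run_quicksort)
  index=jit_sort_func(index_arange)

  return index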

def compare(a,b):
    x = arrr[a]
    y = arrr[b]
    i = 0
    j = 0
    while(i



This gives 3.85s for the generated test data, but the speed of a sorting algorithm depends heavily on the data.
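The index-based approach above (sort an array of indices, and let the comparison function look up the rows) can be illustrated at small scale in plain Python. This is only a sketch of the idea, not the Numba implementation: `rows`, `less_than` and `index_cmp` are hypothetical names, and the lexicographic ordering is an assumption standing in for the real comparison.

```python
from functools import cmp_to_key

# Hypothetical stand-in data: each entry plays the role of one row of arrr.
rows = [
    [3, 1, 2],
    [1, 5, 0],
    [3, 0, 9],
    [1, 5, 0],
]

def less_than(a, b):
    """Return True if row a sorts before row b (assumed: lexicographic order)."""
    return rows[a] < rows[b]

def index_cmp(a, b):
    # Adapt the boolean less-than to the negative/zero/positive contract
    # that cmp_to_key expects.
    if less_than(a, b):
        return -1
    if less_than(b, a):
        return 1
    return 0

# Sort the indices, not the rows; the rows themselves are never moved.
order = sorted(range(len(rows)), key=cmp_to_key(index_cmp))
sorted_rows = [rows[i] for i in order]
print(order)
```

Because only integer indices are swapped during the sort, the cost of moving data stays small even when each row is large; the comparison function remains the hot spot, which is why compiling it matters.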

Simple Example

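import numpy as np
import numba as nb
try:
    # Numba 0.49 and later ship the module under numba.misc
    from numba.misc import quicksort
except ImportError:
    # older Numba releases
    from numba.targets import quicksort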

#simple reverse sort
def compare(a,b):
  return a > b

#create some test data
arrr=np.array(np.random.rand(7900000)*10000,dtype=np.int32)
#we can pass the comparison function
quicksort_func=quicksort.make_jit_quicksort(lt=compare,is_argsort=True)
#compile the sorting function
jit_sort_func=nb.njit(quicksort_func.run_quicksort)
#get the result
ind_sorted=jit_sort_func(arrr)


This implementation is about 35% slower than np.argsort, but a similar slowdown is also common when np.argsort is used in compiled code.
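Before comparing timings it can be worth checking that a custom argsort result is actually correct. The helper below is a minimal NumPy-only sketch for the reverse-sort case; `is_descending_argsort` is a hypothetical name, and the reference ordering is built with np.argsort on the negated values rather than with the Numba routine.

```python
import numpy as np

def is_descending_argsort(values, order):
    """Check that `order` is an index permutation putting `values` in descending order."""
    values = np.asarray(values)
    order = np.asarray(order)
    # Every index must appear exactly once.
    if not np.array_equal(np.sort(order), np.arange(values.size)):
        return False
    # Consecutive elements must never increase.
    picked = values[order]
    return bool(np.all(picked[:-1] >= picked[1:]))

# Small generated example (the seed and size are arbitrary choices).
rng = np.random.default_rng(0)
values = (rng.random(1000) * 10000).astype(np.int32)
# Negate in a wider integer type so the negation cannot overflow.
reference = np.argsort(-values.astype(np.int64), kind="stable")
assert is_descending_argsort(values, reference)
```

The same check can be applied to ind_sorted from the example above, since ties may be ordered differently by different sort routines and a direct index-by-index comparison with np.argsort could fail spuriously.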
