Rasool Ziafaty
Rasool Ziafaty

Reputation: 180

Cython vs Python benchmark

I'm trying to get a benchmark from cython vs pure python with bubble sort algorithm, and this is my result, that shows me pure python is faster than cython is my result correct or something is wrong with my code?

The script tries to create a random list and then try to sort it with bubble sort algorithm in cython and pure python.

benchmark.py

import cybubble
import pybubble
from timeit import default_timer as timer
from decimal import *
import random
getcontext().prec = 8

if __name__ == '__main__':
    random_list = [random.randrange(0,10) for i in range(9999)] #generate a random list
    cy_start = timer()   #start timer
    cybubble.bubble_sort(random_list)   #execute cython bubble sort
    cy_end = timer()     #stop timer

    py_start = timer()   #start timer
    pybubble.bubble_sort(random_list)   #execute pure python bubble sort
    py_end = timer()     #stop time

    print "Python execution time:"
    python = (Decimal(py_end) - Decimal(py_start))
    print (python)

    print "Cython execution time:"
    cython = (Decimal(cy_end) - Decimal(cy_start))
    print (cython)

cybubble.pyx

cpdef bubble_sort(collection):
    cdef int length
    cdef int i
    cdef int j
    length = len(collection)
    for i in range(length-1):
        swapped = False
        for j in range(length-1-i):
            if collection[j] > collection[j+1]:
                swapped = True
                collection[j], collection[j+1] = collection[j+1], collection[j]
        if not swapped:
            break  # Stop iteration if the collection is sorted.
    return collection

pybubble.py

def bubble_sort(collection):
    length = len(collection)
    for i in range(length-1):
        swapped = False
        for j in range(length-1-i):
            if collection[j] > collection[j+1]:
                swapped = True
                collection[j], collection[j+1] = collection[j+1], collection[j]
        if not swapped:
            break  # Stop iteration if the collection is sorted.
    return collection

setup.py

# several files with ext .pyx, that i will call by their name
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext_modules=[
    Extension("cybubble",       ["cybubble.pyx"]),
]

setup(
  name = 'MyProject',
  cmdclass = {'build_ext': build_ext},
  ext_modules = ext_modules,
)

command to compile the cython module: python setup.py build_ext --inplace

result:

Python execution time:
0.0014050007
Cython execution time:
1.5976260

The result shows me that the pure python is faster than cython in this example and cython is not always faster than pure python.

I expect better performance for cython than pure python in this script.

This is how i SOLVED my problem in benchmark.py

import cybubble
import pybubble
from timeit import default_timer as timer
from decimal import *
import random
import copy
getcontext().prec = 8

if __name__ == '__main__':
    random_list = [random.randrange(0,10) for i in range(9999)]
    random_list2 = copy.copy(random_list)
    random_list3 = copy.copy(random_list)


    cy_start = timer()
    cybubble.bubble_sort(random_list2)
    cy_end = timer()

    py_start = timer()
    pybubble.bubble_sort(random_list3)
    py_end = timer()


    print "Python execution time:"
    python = (Decimal(py_end) - Decimal(py_start))
    print (python)

    print "Cython execution time:"
    cython = (Decimal(cy_end) - Decimal(cy_start))
    print (cython)

Result:

Python execution time:
8.9958801
Cython execution time:
1.6466939

Upvotes: 3

Views: 1518

Answers (1)

blhsing
blhsing

Reputation: 107005

This is because you sort the list in-place--you sort the list with Cython first, when the list is randomized, and then sort the list with Python, when the list is already sorted. Since bubble sort has a best-case time complexity of O(n) when the list is already sorted, the call to the Python's version, when the list is already sorted, is magnitudes faster than the call to the Cython's version, when the list is randomized, which takes an average time complexity of O(n ^ 2).

You should sort copies of the list if you would like to benchmark the two implementations properly.

Upvotes: 3

Related Questions