baqyoteto

Reputation: 344

Can Cython further reduce Python method calling overhead for this function?

I have a function that can be called many times (10^6+), depending on end-user input. According to cProfile, the function itself executes quickly, but the sheer number of calls is hurting performance.

Here's a minimal case:

# condition_counter.pyx
# cython: profile=True

import cProfile
import pstats
import pyximport

pyximport.install()

USER_DEFINED_NUM = 10
USER_DEFINED_SPECIAL_VALUES = 1, 3, 8


def condition_met(number):
    value = USER_DEFINED_NUM % number
    return value in USER_DEFINED_SPECIAL_VALUES


cdef cy_condition_met(number):
    value = USER_DEFINED_NUM % number
    return value in USER_DEFINED_SPECIAL_VALUES


def condition_counter(end_number):
    current_number = 1
    special_nums = [num for num in range(current_number, end_number) if condition_met(num)]
    return len(special_nums)

def cy_condition_counter(end_number):
    current_number = 1
    special_nums = [num for num in range(current_number, end_number) if cy_condition_met(num)]
    return len(special_nums)

The code above isn't my actual code; it's just a small example that shows the optimization problem I have. When I profile the Cython and Python versions, I see only minimal differences.

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  ...
  9999999    2.117    0.000    2.117    0.000 min_case_py_overhead.pyx:13(condition_met)
  ...

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  ...
  9999999    2.090    0.000    2.090    0.000 min_case_py_overhead.pyx:18(cy_condition_met)
  ...

Judging by the percall stat, the bodies of the Python and Cython functions execute equally fast. This is why I suspect Python call overhead is the problem. It's also why I don't think PyPy will help.
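One rough way to check that suspicion in plain Python is to time the same test with and without the per-number function call. This is only a sketch: `count_with_calls` and `count_inlined` are hypothetical helper names that are not part of the code above, and the timings will vary by machine.

```python
import timeit

USER_DEFINED_NUM = 10
USER_DEFINED_SPECIAL_VALUES = 1, 3, 8


def condition_met(number):
    value = USER_DEFINED_NUM % number
    return value in USER_DEFINED_SPECIAL_VALUES


def count_with_calls(end_number):
    # One Python-level function call per candidate number.
    return sum(1 for num in range(1, end_number) if condition_met(num))


def count_inlined(end_number):
    # Same test written inline, so there is no per-number function call.
    return sum(
        1
        for num in range(1, end_number)
        if USER_DEFINED_NUM % num in USER_DEFINED_SPECIAL_VALUES
    )


if __name__ == "__main__":
    n = 200_000
    with_calls = timeit.timeit(lambda: count_with_calls(n), number=1)
    inlined = timeit.timeit(lambda: count_inlined(n), number=1)
    print(f"with calls: {with_calls:.3f} s")
    print(f"inlined:    {inlined:.3f} s")
    # Rough estimate only: the gap is mostly call overhead.
    print(f"gap:        {with_calls - inlined:.3f} s")
```

If the gap accounts for a large share of the total, the cost is in making the calls rather than in the work each call does.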

Is there any way to reduce the overhead further? I tried statically declaring variables, but that sometimes slows things down. I would also welcome performance improvements outside of Cython. My main problem is that the function is called a very large number of times, and reducing the call count is not an option in my scenario.

Upvotes: 1

Views: 827

Answers (1)

tdelaney

Reputation: 77387

You can reduce the overhead of Python objects by declaring everything with cdef, so that the values and the function call stay at the C level. I removed the profiling code in favor of a separate module that times 10M runs of the function, and that cut over 90% off the run time. Here are your existing functions, plus new ones whose names begin with "cp".

condition_counter.pyx

USER_DEFINED_NUM = 10
USER_DEFINED_SPECIAL_VALUES = 1, 3, 8

def condition_met(number):
    value = USER_DEFINED_NUM % number
    return value in USER_DEFINED_SPECIAL_VALUES

cdef cy_condition_met(number):
    value = USER_DEFINED_NUM % number
    return value in USER_DEFINED_SPECIAL_VALUES

def condition_counter(end_number):
    current_number = 1
    special_nums = [num for num in range(current_number, end_number) if condition_met(num)]
    return len(special_nums)

def cy_condition_counter(end_number):
    current_number = 1
    special_nums = [num for num in range(current_number, end_number) if cy_condition_met(num)]
    return len(special_nums)

#----------------------------------------------------------------------
# Really go down the cython path
#----------------------------------------------------------------------

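# C-level globals: plain C ints, so no Python objects are involved.
cdef int CP_USER_DEFINED_NUM = 10
cdef int CP_USER_DEFINED_SPECIAL_VALUES[3]
CP_USER_DEFINED_SPECIAL_VALUES = [1, 3, 8]

# Typed argument and return value: called as a C function within this module.
cdef int cp_condition_met(int number):
    cdef int value = CP_USER_DEFINED_NUM % number
    return value in CP_USER_DEFINED_SPECIAL_VALUES

# cpdef keeps this callable from Python while the loop itself runs in C.
cpdef int cp_condition_counter(int end_number):
    cdef int current_number = 1
    cdef int num
    cdef int count = 0
    # Count matches directly instead of building a list and taking its length.
    for num in range(current_number, end_number):
        if cp_condition_met(num):
            count += 1
    return count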

The test script

#!/usr/bin/env python3

import condition_counter
from time import perf_counter

iterations = 10_000_000

start = perf_counter()
result = condition_counter.condition_counter(iterations)
delta = perf_counter()-start
print("py", delta)

start = perf_counter()
result = condition_counter.cy_condition_counter(iterations)
delta = perf_counter()-start
print("cy", delta)

start = perf_counter()
result = condition_counter.cp_condition_counter(iterations)
delta = perf_counter()-start
print("cp", delta)

And the performance numbers, in seconds:

py 0.6689409520004119
cy 0.5783118550007202
cp 0.03368412400050147
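The test script only prints timings, so it's worth confirming that the fast version still returns the same count. Here is a small pure-Python reference to compare against. It is a sketch: `reference_count` is a hypothetical helper, not part of the module above, and its defaults mirror the constants from the question.

```python
def reference_count(end_number, num=10, special_values=(1, 3, 8)):
    """Count n in [1, end_number) for which num % n is a special value."""
    return sum(1 for n in range(1, end_number) if num % n in special_values)


if __name__ == "__main__":
    expected = reference_count(10_000)
    print(expected)  # 3: only n = 3, 7 and 9 match when num is 10

    # With the compiled module available, the same check would be:
    # import condition_counter
    # assert condition_counter.cp_condition_counter(10_000) == expected
```

With these constants the count stays at 3 for any end_number of 10 or more, because 10 % n is 10 for every n above 10.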

Upvotes: 2
