Reputation: 483
I'm trying to use Cython to speed up some expensive Python for loops in some numerical code, but have run into an issue where I'm seeing almost no speedup, and think I may have to cythonize a lot more of my code than I had hoped.
As an example, let's say I have the following two functions that have been cythonized, and are part of a larger class:
def update(self, double[::1] state_data, double[::1] sensor_data, double sigma):
    cdef int i
    cdef int N = state_data.shape[0]
    for i in range(N):
        self.output[i] = self.process_sensor_data(state_data[i], sensor_data, sigma)
def process_sensor_data(self, double current_state, double[::1] sensor_data, double sigma):
    cdef int i
    cdef int N = sensor_data.shape[0]
    # Initialize the accumulators; a bare cdef double starts with an undefined value.
    cdef double x = 0.0
    cdef double y = 0.0
    for i in range(N):
        x += self.do_something(current_state, sigma)
        y += self.do_something_else(sensor_data)
    return min(x, y)
As seen above, the update() function takes in some numpy arrays (double[::1]) and a floating point number, and then runs a for loop that calls the process_sensor_data() function. The process_sensor_data() function then runs its own for loop, which calls two additional functions called do_something() and do_something_else(), that have been defined somewhere else in the class.
Now let's assume that I can fully cythonize the do_something() function, and can thus define it as a fast cdef function with a function header such as,
cdef double do_something(self, double current_state, double sigma):
    ...
but I'm not able to cythonize the do_something_else() function (e.g. maybe it calls some functions from the numpy or scipy libraries). Would this imply that the for loops inside process_sensor_data() and update() would still run at similar speeds to vanilla Python for loops, and not see much of a speedup from Cython?
Put another way, if I cythonize a for loop similarly to what was done above, but there are some function calls and/or calculations inside the for loop that cannot be cythonized (i.e. if Cython's HTML annotation output shows some yellow code lines in the for loop), does this mean that I won't see much of a speedup when using Cython?
From my own experiments, this unfortunately seems to be the case, but I wanted to make sure that I'm not going crazy. In my code I have a function that takes about 20 seconds to execute in Python, but still takes about 20 seconds to execute after I've tried to cythonize the slow for loops. Having spent quite a few hours going down the rabbit hole of trying to cythonize as many of the variables and functions that are being called within the for loops as possible, I'm starting to think it would be simpler and more readable to implement the class in C++. Any help or guidance would be appreciated, thanks.
Upvotes: 1
Views: 389
Reputation: 7167
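Cython isn't always faster, especially if you keep using Python datatypes instead of C datatypes. Also, be aware that conversions between Python and C datatypes can happen implicitly in Cython, and can be expensive. In your example, process_sensor_data() is a def method called through self, so every iteration of the loop in update() still makes an ordinary Python method call and converts its arguments to Python objects, and the same goes for each call to do_something_else() in the inner loop.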
You could also take a look at Numba's nopython mode (the @njit decorator) and PyPy (pypy3).
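To make the cost concrete for the loops in the question, here is a rough pure-Python sketch of the same call pattern. It is not Cython and it is not your actual code: the bodies of do_something() and do_something_else() are made-up stand-ins, and the call counter exists only for illustration. It also assumes do_something_else() is pure (same input gives the same result, no side effects), which may not hold for your real function. Under that assumption, the call that cannot be cythonized can be made once and its result reused, instead of once per inner-loop iteration.

```python
# Counts Python-level calls to the function that cannot be cythonized.
# Illustration only; not part of the original code.
calls = {"do_something_else": 0}


def do_something(current_state, sigma):
    # Hypothetical stand-in for the routine that can become a cdef function.
    return current_state * sigma


def do_something_else(sensor_data):
    # Hypothetical stand-in for the routine that has to stay in Python
    # (for example because it calls into NumPy or SciPy).
    calls["do_something_else"] += 1
    return sum(sensor_data) / len(sensor_data)


def update_naive(state_data, sensor_data, sigma):
    # Same call pattern as in the question: one Python-level call to
    # do_something_else() per inner-loop iteration.
    n = len(sensor_data)
    output = []
    for current_state in state_data:
        x = 0.0
        y = 0.0
        for _ in range(n):
            x += do_something(current_state, sigma)
            y += do_something_else(sensor_data)
        output.append(min(x, y))
    return output


def update_hoisted(state_data, sensor_data, sigma):
    # Assumption: do_something_else() is pure, so its result can be
    # computed once and reused in every iteration.
    n = len(sensor_data)
    y_step = do_something_else(sensor_data)  # single Python-level call
    output = []
    for current_state in state_data:
        x = 0.0
        y = 0.0
        for _ in range(n):
            x += do_something(current_state, sigma)
            y += y_step  # same additions as before, no Python call
        output.append(min(x, y))
    return output
```

With 3 states and 4 sensor readings, update_naive() calls do_something_else() 12 times while update_hoisted() calls it once, and both return the same list. If the real function depends on the loop variables, this kind of hoisting does not apply, and the Python-level call stays in the loop whatever Cython does with the rest.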
Upvotes: 1