Marlon

Reputation: 47

List comprehension with cython

I am trying to speed up my Python code with Cython, and so far it is working great. I have one remaining problem, however: dealing with lists.

Using cython -a myscript.pyx, I can see that the only parts of my code that call Python routines are when I'm dealing with lists.

For example, I have a numpy array (sel1) that I need to split like this:

x1 = numpy.array([t[0] for t in sel1])
y1 = numpy.array([t[1] for t in sel1])
z1 = numpy.array([t[2] for t in sel1])

and I have no idea how to speed this up with Cython.

Another occurrence is when using list/array indices, like this:

cdef numpy.ndarray[DTYPE_t, ndim=2] init_value_1 = coords_1[0], init_value_2 = coords_2[0]

I am aware that what takes time is the Python routines that are used to access the parts of the lists I need. I currently have no idea how to speed this up though.

Upvotes: 3

Views: 3397

Answers (1)

ali_m

Reputation: 74172

Manipulating lists in Cython is inherently more expensive than using numpy arrays or typed memoryviews, since the former necessitates making Python API calls, whereas with the latter it's possible to directly address the underlying C memory buffers. The best way to avoid this overhead is to simply not use lists wherever possible.
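As a sketch of what that looks like in practice (the function name, dtype, and loop are my own, assuming `sel1` is a C-contiguous float64 array): with a typed memoryview, each element access compiles down to plain C buffer indexing rather than a Python API call:

```cython
import numpy as np
cimport numpy as np

def split_columns(double[:, ::1] sel1):
    """Copy the three columns of an (n, 3) buffer without touching the Python API."""
    cdef Py_ssize_t i, n = sel1.shape[0]
    x1 = np.empty(n)
    y1 = np.empty(n)
    z1 = np.empty(n)
    cdef double[::1] xv = x1, yv = y1, zv = z1
    for i in range(n):
        xv[i] = sel1[i, 0]   # C buffer access, no per-element Python object boxing
        yv[i] = sel1[i, 1]
        zv[i] = sel1[i, 2]
    return x1, y1, z1
```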

You shouldn't really be using list comprehensions to split your sel1 array anyway - it will be much faster to simply index into the columns:

x1 = sel1[:, 0]
y1 = sel1[:, 1]
z1 = sel1[:, 2]

Creating new numpy arrays in Cython will always incur some Python overhead, since they are allocated on the Python heap and accounted for by Python's memory management system. Your cdef line that initializes init_value_1 and init_value_2 might also be more expensive than it needs to be if coords_1 or coords_2 is a list or tuple rather than a numpy array.
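If that is the case, converting once up front is the cheap fix. A minimal sketch (the contents of `coords_1` are made up, and the shapes are simplified to 2-D for illustration):

```python
import numpy as np

# Hypothetical stand-in for coords_1: a Python list of (x, y, z) tuples.
coords_1 = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]

# Convert once up front; indexing like coords_1[0] then reads from a C
# buffer instead of going through the Python list API on every access.
coords_1 = np.asarray(coords_1, dtype=np.float64)
init_value_1 = coords_1[0]
```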

Upvotes: 4
