Cython boolean indexing optimization

Question

What is the best way to convert the following code to Cython?

Given the following example:

import numpy as np

# Set up example data Z and A
Z = np.random.randn(10,10)
A = np.random.randn(10,10)
A[0,1] = np.nan
A[1,3] = np.nan
A[5,3] = np.nan
A[3,5] = np.nan

B = np.isnan(A).transpose()

C = Z[B * B.transpose()]

I want to optimize the type definition of np.ndarray B in the above example and optimize the creation of ndarray C.

I tried typing B as uint8 and as the Python and C++ bool types.

cdef np.ndarray[np.uint8_t, ndim=2, cast=True] B

This yields little or no speedup. I also tried:

cdef np.ndarray[bool, ndim=2, cast=True] B

where bool comes from either cpython (from cpython cimport bool) or libcpp (from libcpp cimport bool). In both cases the above code throws an error.

Saullo G. P. Castro · Accepted Answer

The right way to create a buffer that can hold np.nan values is to use np.float_t or np.double_t. If you try to use an integer buffer, the following error is raised:

ValueError: cannot convert float NaN to integer
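The same error can be reproduced in plain NumPy, without compiling anything. This is only a minimal sketch: the names int_buf and float_buf and the 2x2 shape are made up for illustration and are not part of the original code.

```python
import numpy as np

# Hypothetical integer buffer: NaN has no integer representation,
# so the assignment below is expected to raise ValueError.
int_buf = np.zeros((2, 2), dtype=np.int64)
try:
    int_buf[0, 0] = np.nan
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# A float64 buffer (what np.double_t maps to) stores NaN without complaint.
float_buf = np.zeros((2, 2), dtype=np.float64)
float_buf[0, 0] = np.nan
has_nan = bool(np.isnan(float_buf[0, 0]))
```

Here error_message holds the text of the ValueError, while the float64 assignment succeeds.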

Then, you could use something like:

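import numpy as np
cimport numpy as np

def example():
    # Typed buffers can only be declared as function-local variables
    cdef np.ndarray[np.double_t, ndim=2] A, Z

    Z = np.random.randn(10,10)
    A = np.random.randn(10,10)
    A[0,1] = np.nan
    A[1,3] = np.nan
    A[5,3] = np.nan
    A[3,5] = np.nan
    B = np.isnan(A).transpose()
    C = Z[B * B.transpose()]
    return C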
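To check what the mask actually selects, the indexing step can be run in plain NumPy. This is a rough sketch rather than a tuned implementation: the seeded generator and the names mask, selected and same_as_product are assumptions added for illustration. For boolean arrays, B * B.transpose() is an element-wise logical AND, so B & B.transpose() gives the same mask.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make the run repeatable
Z = rng.standard_normal((10, 10))
A = rng.standard_normal((10, 10))
for i, j in [(0, 1), (1, 3), (5, 3), (3, 5)]:
    A[i, j] = np.nan

B = np.isnan(A).transpose()

# True only where both (i, j) and (j, i) are NaN in A
mask = B & B.transpose()
C = Z[mask]  # 1-D array of the selected elements, in row-major order

same_as_product = np.array_equal(mask, B * B.transpose())
selected = np.argwhere(mask)  # row/column indices of the True entries
```

With the NaN positions from the example, only the symmetric pair (3, 5) and (5, 3) survives the AND, so C contains Z[3, 5] and Z[5, 3].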
