How do NumPy and SciPy turn C functions into vectorized Python functions?

Question

As I understand it, vectorized NumPy functions are faster than Python loops because the loops run in compiled C or Fortran code. I would like to know where in the source code this happens.

For example, the scipy.special.bdtr binomial CDF function accepts array-like arguments k, n, p and returns an ndarray, provided the arguments are broadcastable. The documentation says that scipy.special.bdtr is a wrapper for a routine in the Cephes Mathematical Functions Library. Digging through the source code on GitHub, I found a scipy/special/cephes/bdtr.c file containing the C code for the routine; here are what I believe to be the first three lines of the relevant C function:

    double bdtr(k, n, p)
    int k, n;
    double p;

It appears that the underlying C function operates only on scalars, not arrays, and I can't find the source code where it is converted into a Python function that operates on arrays.

javidcf · Accepted Answer

For scipy.special functions, the C code contains only the "kernels" of the functions, that is, the code that applies the function to scalars. Each kernel is then wrapped into a ufunc by automatically generated Cython code; once it is a ufunc, NumPy's ufunc machinery handles broadcasting and runs the loop over the array elements in compiled code. The generation draws on C header files such as scipy/special/cephes.h, Cython declaration files such as scipy/special/_cephes.pxd, the file scipy/special/functions.json, which lists every function to be generated for scipy.special, and finally scipy/special/_generate_pyx.py, the script that actually produces the Cython code.
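To make the "scalar kernel plus generated wrapper" idea concrete, here is a rough pure-Python analogy. It is only a sketch, not SciPy's actual code: the real wrappers are generated Cython that registers C loops with NumPy, while `bdtr_scalar` and `make_elementwise` below are hypothetical names invented for illustration, and the sketch handles only flat lists and scalars rather than full broadcasting.

```python
import math
from itertools import repeat


def bdtr_scalar(k, n, p):
    """Scalar 'kernel' (hypothetical stand-in for the C routine):
    binomial CDF, i.e. the sum over j = 0..k of C(n, j) p^j (1-p)^(n-j)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def make_elementwise(kernel):
    """Hypothetical wrapper factory: lifts a scalar kernel so that it accepts
    flat sequences, stretching scalar arguments to the common length."""

    def wrapped(*args):
        lengths = {len(a) for a in args if isinstance(a, (list, tuple))}
        if not lengths:
            # All arguments are scalars: call the kernel directly.
            return kernel(*args)
        if len(lengths) > 1:
            raise ValueError("sequence arguments must have the same length")
        (size,) = lengths
        # Repeat scalars so every argument yields `size` values.
        columns = [
            a if isinstance(a, (list, tuple)) else repeat(a, size) for a in args
        ]
        # The element loop: in SciPy this part runs in compiled code.
        return [kernel(*row) for row in zip(*columns)]

    return wrapped


bdtr_vec = make_elementwise(bdtr_scalar)
result = bdtr_vec([0, 1, 2], 2, 0.5)
print(result)
```

The kernel knows nothing about arrays; all the looping lives in the wrapper, which is the part SciPy generates automatically. NumPy exposes a similar idea at the Python level through np.frompyfunc, which turns a Python scalar function into a ufunc, although the per-element call there still goes through the interpreter.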
