Reputation: 25
I am trying to parallelize the following code, which includes numerous numpy array operations:
    #fft_fit.pyx
    import cython
    import numpy as np
    cimport numpy as np
    from cython.parallel cimport prange
    from libc.stdlib cimport malloc, free

    dat1 = np.genfromtxt('/home/bagchilab/Sumanta_files/fourier_ecology_sample_data_set.csv',delimiter=',')
    dat = np.delete(dat1, 0, 0)
    yr = np.unique(dat[:,0])
    fit_dat = np.empty([1,2])

    def fft_fit_yr(np.ndarray[double, ndim=1] yr, np.ndarray[double, ndim=2] dat, int yr_idx, int pix_idx):
        cdef np.ndarray[double, ndim=2] yr_dat1
        cdef np.ndarray[double, ndim=2] yr_dat
        cdef np.ndarray[double, ndim=2] fft_dat
        cdef np.ndarray[double, ndim=2] fft_imp_dat
        cdef int len_yr = len(yr)
        for i in prange(len_yr ,nogil=True):
            with gil:
                yr_dat1 = dat[dat[:,yr_idx]==yr[i]]
                yr_dat = yr_dat1[~np.isnan(yr_dat1).any(axis=1)]
                print "index" ,i
                y_fft = np.fft.fft(yr_dat[:,pix_idx])
                y_fft_abs = np.abs(y_fft)
                y_fft_freq = np.fft.fftfreq(len(y_fft), 1)
                x_fft = range(len(y_fft))
                fft_dat = np.column_stack((y_fft, y_fft_abs))
                cut_off_freq = np.percentile(y_fft_abs, 25)
                imp_freq = np.array(y_fft_abs[y_fft_abs > cut_off_freq])
                fft_imp_dat = np.empty((1,2))
                for j in range(len(imp_freq)):
                    freq_dat = fft_dat[fft_dat[:, 1]==imp_freq[j]]
                    fft_imp_dat = np.vstack((fft_imp_dat , freq_dat[0,:]))
                fft_imp_dat = np.delete(fft_imp_dat, 0, 0)
                fit_dat1 = np.fft.ifft(fft_imp_dat[:,0])
                fit_dat2 = np.column_stack((fit_dat1.real, [yr[i]] * len(fit_dat1)))
                fit_dat = np.concatenate((fit_dat, fit_dat2), axis = 0)
I have used the following code for setup.py:

    #### setup.py
    from distutils.core import setup
    from distutils.extension import Extension
    from Cython.Distutils import build_ext

    setup(
        cmdclass = {'build_ext': build_ext},
        # The OpenMP flags are arguments of Extension, not of the list
        ext_modules = [Extension("fft_fit_yr", ["fft_fit.pyx"],
                                 extra_compile_args=['-fopenmp'],
                                 extra_link_args=['-fopenmp'])]
    )
But I get the following error when I compile fft_fit.pyx with Cython:

    for i in prange(len_yr ,nogil=True):
    target may not be a Python object as we don't have the GIL

Please let me know where I am going wrong in using the prange function. Thanks.
Upvotes: 0
Views: 2260
Reputation: 74172
You can't (at least not using Cython).
Numpy functions operate on Python objects and therefore require the GIL, which prevents multiple native threads from executing in parallel. If you compile your code using cython -a, you will get an annotated HTML file which shows where Python C-API calls are being made (and therefore where the GIL can't be released).
Cython is most useful where you have a specific bottleneck in your code that cannot easily be sped up using vectorization. If your code is already spending most of its time in numpy function calls, then calling those exact same functions from Cython is not going to result in any significant performance improvement. In order to see a noticeable difference you would need to write some or all of your array operations as explicit for loops. However, it looks to me as though there are much simpler optimizations that could be made to your code.
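As one illustration of such a simplification, the inner loop that grows fft_imp_dat with np.vstack could probably be replaced by a single boolean mask in plain numpy. The sketch below is not a drop-in replacement: the function name keep_strong_coefficients is made up for this example, and it assumes the intent is to keep every FFT coefficient whose magnitude exceeds the 25th percentile. The original loop looks rows up by magnitude and takes the first match, so where two coefficients share a magnitude (as conjugate pairs of a real signal typically do) the two versions can give different output.

```python
import numpy as np

def keep_strong_coefficients(signal, percentile=25):
    """Hypothetical helper: invert only the FFT coefficients above a magnitude cutoff."""
    y_fft = np.fft.fft(signal)
    y_fft_abs = np.abs(y_fft)
    cut_off = np.percentile(y_fft_abs, percentile)
    # One boolean mask selects the coefficients directly, so there is no
    # Python-level loop and no repeated np.vstack reallocation.
    kept = y_fft[y_fft_abs > cut_off]
    # As in the question, the inverse transform is taken over the kept
    # coefficients only, so the result is shorter than the input.
    return np.fft.ifft(kept).real
```

Growing an array with np.vstack inside a loop copies the whole array on every iteration, which is usually a far larger cost than anything prange could recover.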
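I suggest that you do the following:

1. Profile your code (e.g. using line_profiler) to see where the bottlenecks are.
2. If it is still too slow once those bottlenecks have been dealt with, parallelize it using joblib or multiprocessing. Parallelization is usually the last tool to reach for once you've already tried everything else you can think of.

Upvotes: 4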
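For the multiprocessing suggestion, a minimal sketch might look like the following. The function names fit_one_year and fit_all_years and the processes parameter are assumptions made for this example, not part of the original code. Each year is handled by an independent worker process, which sidesteps the GIL. The coefficient filter is written as a boolean mask for brevity, which may not match the original loop exactly when magnitudes tie.

```python
import numpy as np
from functools import partial
from multiprocessing import Pool

def fit_one_year(year, dat, yr_idx, pix_idx):
    """Hypothetical worker: returns an (n, 2) array of fitted values and the year."""
    rows = dat[dat[:, yr_idx] == year]
    rows = rows[~np.isnan(rows).any(axis=1)]        # drop rows containing NaN
    if rows.shape[0] == 0:
        return np.empty((0, 2))
    y_fft = np.fft.fft(rows[:, pix_idx])
    y_abs = np.abs(y_fft)
    kept = y_fft[y_abs > np.percentile(y_abs, 25)]  # coefficients above the cutoff
    if kept.size == 0:
        return np.empty((0, 2))
    fit = np.fft.ifft(kept).real
    return np.column_stack((fit, np.full(len(fit), year)))

def fit_all_years(dat, yr_idx, pix_idx, processes=None):
    """Run fit_one_year for every year, serially or in a process pool."""
    year_col = dat[:, yr_idx]
    years = np.unique(year_col[~np.isnan(year_col)])
    worker = partial(fit_one_year, dat=dat, yr_idx=yr_idx, pix_idx=pix_idx)
    if processes is None:
        results = [worker(y) for y in years]        # serial fallback
    else:
        with Pool(processes) as pool:
            results = pool.map(worker, years)       # one task per year
    return np.concatenate(results, axis=0)
```

Calling fit_all_years(dat, 0, 1, processes=4) should be done under an `if __name__ == "__main__":` guard on platforms that start worker processes by spawning. Because dat is pickled and sent to the workers, the parallel version only pays off when the per-year work outweighs that transfer cost.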