Reputation: 16987
I am trying to optimize my Cython code, and there seems to be quite a bit of room for improvement. Here is part of the profile from the %prun extension in the IPython notebook:
 7016695 function calls in 18.475 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   400722    7.723    0.000   15.086    0.000 _methods.py:73(_var)
   814815    4.190    0.000    4.190    0.000 {method 'reduce' of 'numpy.ufunc' objects}
        1    1.855    1.855   18.475   18.475 {_cython_magic_aed83b9d1a706200aa6cef0b7577cf41.knn_alg}
   403683    0.838    0.000    1.047    0.000 _methods.py:39(_count_reduce_items)
   813031    0.782    0.000    0.782    0.000 {numpy.core.multiarray.array}
   398748    0.611    0.000   15.485    0.000 fromnumeric.py:2819(var)
   804405    0.556    0.000    1.327    0.000 numeric.py:462(asanyarray)
Seeing as my program is spending almost 8 seconds just calculating variances, I am hoping this can be sped up.

I am calculating the variance of a 1D array of length 404 with np.var(), roughly 1000 times. I checked the C standard library, and unfortunately there is no function for this, and I don't want to write my own in C.
1. Is there any other option?
2. Any way to reduce the time spent in the second item on the list?
Here is my code if it helps to see:
cpdef knn_alg(np.ndarray[double, ndim=2] temp, np.ndarray[double, ndim=1] jan1, int L, int w, int B):

    cdef np.ndarray[double, ndim=3] lnn = np.zeros((L+1, temp.shape[1], 365))
    lnn = lnn_alg(temp, L, w)

    cdef np.ndarray[double, ndim=2] sim = np.zeros((len(temp), temp.shape[1]))
    cdef np.ndarray[double, ndim=2] a = np.zeros((L+1, lnn.shape[1]))
    cdef int b
    cdef np.ndarray[double, ndim=2] c = np.zeros((L, lnn.shape[1]-3))
    cdef np.ndarray[double, ndim=2] lnn_scale = np.zeros((L, lnn.shape[1]))
    cdef np.ndarray[double, ndim=2] cov_t = np.zeros((3, 3))
    cdef np.ndarray[double, ndim=2] dk = np.zeros((L, 4))
    cdef int random_selection
    cdef np.ndarray[double, ndim=1] day_month
    cdef int day_of_year
    cdef np.ndarray[double, ndim=2] lnn_scaled
    cdef np.ndarray[double, ndim=2] temp_scaled
    cdef np.ndarray[double, ndim=2] eig_vec
    cdef double PC_t
    cdef np.ndarray[double, ndim=1] PC_l
    cdef double K
    cdef np.ndarray[double, ndim=2] knn
    cdef np.ndarray[double, ndim=1] val
    cdef np.ndarray[double, ndim=1] pn
    cdef double rand_num
    cdef int nn
    cdef int index
    cdef int inc
    cdef int i

    sim[0, :] = jan1

    for i in xrange(1, len(temp), B):

        # If leap day then randomly select feb 28 or mar 31
        if (temp[i, 4] == 2) & (temp[i, 3] == 29):
            random_selection = np.random.randint(0, 1)
            day_month = np.array([[29, 2], [1, 3]])[random_selection]
        else:
            day_month = temp[i, 3:5]

        # Convert day month to day of year for L+1 nearest neighbors selection
        current = datetime.datetime(2014, (<int>day_month[1]), (<int>day_month[0]))
        day_of_year = current.timetuple().tm_yday - 1

        # Take out current day from L+1 nearest neighbors
        a = lnn[:, :, day_of_year]
        b = np.where((a[:, 3:6] == temp[i, 3:6]).all(axis=-1))[0][0]
        c = np.delete(a, (b), axis=0)

        # Scale and center data from nearest neighbors and spatially averaged historical data
        lnn_scaled = scale(c[:, 0:3])
        temp_scaled = scale(temp[:, 0:3])

        # Calculate covariance matrix of nearest neighbors
        cov_t[:, :] = np.cov(lnn_scaled.T)

        # Calculate eigenvalues and vectors of covariance matrix
        eig_vec = eig(cov_t)[1]

        # Calculate principal components of scaled L nearest neighbors
        PC_t = np.dot(temp_scaled[i], eig_vec[0])
        PC_l = np.dot(lnn_scaled, eig_vec[0])

        # Calculate Mahalanobis distance
        dk = np.zeros((404, 4))
        dk[:, 0] = np.array([sqrt((PC_t-pc)**2/np.var(PC_l)) for pc in PC_l])
        dk[:, 1:4] = c[:, 3:6]

        # Extract K nearest neighbors
        dk = dk[dk[:, 0].argsort()]
        K = round(sqrt(L), 0)
        knn = dk[0:(<int>K)]

        # Create probability density function
        val = np.array([1.0/k for k in range(1, len(knn)+1)])
        wk = val/(<int>val.sum())
        pn = wk.cumsum()

        # Select next day's value from KNNs using probability density function with random value
        rand_num = np.random.rand(1)[0]
        nn = (abs(pn-rand_num)).argmin()
        index = np.where((temp[:, 3:6] == knn[nn, 1:4]).all(axis=-1))[0][0]

        if i+B > len(temp):
            inc = len(temp) - i
        else:
            inc = B

        if (index+B > len(temp)):
            index = len(temp)-B

        sim[i:i+inc, :] = temp[index:index+inc, :]

    return sim
The variance calculation is in this line:
dk[:,0] = np.array([sqrt((PC_t-pc)**2/np.var(PC_l)) for pc in PC_l])
Any advice would be very helpful, as I am quite new to Cython.
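One thing I noticed while profiling: np.var(PC_l) inside the list comprehension is recomputed for every element, which would explain the ~400k calls to _var above. A pure-NumPy sketch (not something I have adopted in the code above) that hoists the variance out and vectorizes the column gives the same result:

```python
import numpy as np

def mahalanobis_column(PC_l, PC_t):
    # Compute the variance once instead of len(PC_l) times inside the comprehension.
    var = PC_l.var()
    # sqrt((PC_t - pc)**2 / var) == |PC_t - pc| / sqrt(var), vectorized over PC_l.
    return np.abs(PC_t - PC_l) / np.sqrt(var)
```

This keeps everything in NumPy and removes almost all of the repeated np.var() calls in one go.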
Upvotes: 3
Views: 889
Reputation: 16987
I went through said calculation, and I think the reason it was going so slowly is that I was using np.var(), which is a Python (or NumPy) function and so prevents the loop from being compiled to C. If anyone knows how to do that while still using NumPy, let me know.
What I ended up doing was coding the calculation from this:
dk[:,0] = np.array([sqrt((PC_t-pc)**2/np.var(PC_l)) for pc in PC_l])
to this as a separate function:
cimport cython
cimport numpy as np
import numpy as np
from libc.math cimport sqrt as csqrt
from libc.math cimport pow as cpow

@cython.boundscheck(False)
@cython.cdivision(True)
cdef cy_mahalanobis(np.ndarray[double, ndim=1] PC_l, double PC_t):
    cdef unsigned int i, j, L
    L = PC_l.shape[0]
    cdef np.ndarray[double] dk = np.zeros(L)
    cdef double x, total, mean, var

    # First pass: mean
    total = 0
    for i in xrange(L):
        x = PC_l[i]
        total = total + x
    mean = total / L

    # Second pass: population variance
    total = 0
    for i in xrange(L):
        x = cpow(PC_l[i] - mean, 2)
        total = total + x
    var = total / L

    # Distance column
    for j in xrange(L):
        dk[j] = csqrt(cpow(PC_t - PC_l[j], 2) / var)
    return dk
And because I am not calling any Python functions (including NumPy), the entire loop was compiled to C (no yellow lines when using the annotate option, cython -a file.pyx, or %%cython -a in the IPython notebook).
Overall my code ended up being an order of magnitude faster! WELL worth the effort of coding this by hand. My Cython (and Python, for that matter) is not the greatest, so any additional suggestions or answers would be appreciated.
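For anyone wanting to sanity-check cy_mahalanobis, an equivalent pure-NumPy reference (a sketch of mine, not part of the compiled code; the function name is made up) is handy for comparing outputs:

```python
import numpy as np

def mahalanobis_ref(PC_l, PC_t):
    # np.var() also computes the population variance (divides by L),
    # matching the hand-written two-pass loop in cy_mahalanobis.
    var = PC_l.var()
    return np.sqrt((PC_t - PC_l) ** 2 / var)
```

Running both on the same PC_l and PC_t should give identical arrays up to floating-point rounding.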
Upvotes: 3
Reputation: 3709
It might help to make sure your for loop

dk[:,0] = np.array([sqrt((PC_t-pc)**2/np.var(PC_l)) for pc in PC_l])

is being converted correctly by looking at the generated C code. Check out this link to the Cython docs.

If not, it might help to declare pc as a cdef type so that no Python objects are referred to. (Another link to the docs)
Upvotes: 1