Reputation: 8556
I have a function that calculates a matrix for me, but it is really slow. Even in Cython it runs slowly, so I was wondering whether anything can be done to speed up the code below.
EDIT: I've changed des = np.zeros([n-m+1,m]) to cdef np.ndarray des = np.zeros([n-m+1,m], dtype=DTYPE). (This is faster than np.empty...) Instead of writing m/2 everywhere, I've added cdef int m2 = m/2, but that didn't seem to help.
    import numpy as np
    cimport numpy as np
    cimport cython

    DTYPE = float
    ctypedef np.float_t DTYPE_t

    @cython.boundscheck(False)
    @cython.cdivision(True)
    @cython.wraparound(False)
    cpdef map4(np.ndarray[DTYPE_t, ndim=1] s, int m):
        cdef int n = len(s)
        cdef int i
        cdef int j
        # des is left untyped here, so every des[...] access goes through Python
        des = np.zeros([n-m+1,m])
        for j in xrange(m):
            for i in xrange(m/2,n-m/2-1):
                des[i-m/2,j] = s[i-j+m/2]
        return des, s, m, n
Typically n ~ 10000 and m = 1001.
Upvotes: 2
Views: 3062
Reputation: 121
I'm not seeing m being set anywhere. At the bottom of your code, you mention that n~10,000, and m=1001. Does that mean that m is a constant integer of 32 bits? Not seeing your compilation flags, it's frequently worthwhile to try it with and without -ffast-math to see if that makes a difference. With large arrays and matrices, using a smaller data type usually shows a significant speedup, provided that the smaller data type preserves the range and accuracy that your program needs, though I'm not seeing a large potential benefit on this calculation.
If you could show us the C code that is generated by this, that might help, as well.
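To put a rough number on the data-type point, here is a minimal sketch in plain Python/NumPy that computes how much memory des needs for the sizes given in the question. The helper name des_nbytes is made up for illustration; float64 is what DTYPE = float corresponds to, and float32 is only an option if about seven significant digits are enough.

```python
import numpy as np

def des_nbytes(n, m, dtype):
    """Bytes needed for an (n - m + 1) x m matrix with the given element type."""
    rows, cols = n - m + 1, m
    return rows * cols * np.dtype(dtype).itemsize

# Sizes from the question: n ~ 10000, m = 1001
size64 = des_nbytes(10000, 1001, np.float64)
size32 = des_nbytes(10000, 1001, np.float32)
print(size64, size32)
```

Halving the element size halves the memory the loop has to write, which is where any gain from a smaller type would come from here.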
Upvotes: 0
Reputation: 46578
It might also help to use np.empty instead of np.zeros, assuming you'll assign each element:
des = np.empty([n-m+1,m])
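Whether every element really gets assigned is worth checking first. The sketch below is plain Python/NumPy, not Cython: it replays the question's double loop on a small NaN-filled array and reports the rows that are never written. The function names are hypothetical, and m/2 is written as m // 2 on the assumption that integer division is intended.

```python
import numpy as np

def fill_like_map4(des, s, m):
    """Replay the question's double loop, writing into an existing array."""
    n = len(s)
    half = m // 2
    for j in range(m):
        for i in range(half, n - half - 1):
            des[i - half, j] = s[i - j + half]
    return des

def unwritten_rows(n, m):
    """Return the indices of rows the loop never assigns."""
    s = np.arange(n, dtype=float)
    des = np.full((n - m + 1, m), np.nan)  # NaN marks "never written"
    fill_like_map4(des, s, m)
    return np.flatnonzero(np.isnan(des).all(axis=1)).tolist()

print(unwritten_rows(12, 5))
```

With these loop bounds the last row is never touched (the last two rows when m is even), so np.empty would leave arbitrary values there, while np.zeros leaves zeros.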
Upvotes: 2
Reputation: 47092
Try:
cdef np.ndarray des = np.zeros([n-m+1,m])
You can also make the declaration more specific, as you did for the parameter s, by giving the element type and the number of dimensions. You can also turn off bounds checking. Check out the Cython NumPy tutorial.
You also might want to make a variable:
cdef int m_2 = m/2
and use that everywhere you have m/2, because I don't know if Cython will do that optimization for you.
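For comparison, here is a minimal sketch of the same matrix built in plain NumPy with index broadcasting instead of explicit loops. This is a different technique from the typed Cython loop, not a drop-in replacement for it. The name map4_numpy is made up, and it assumes m/2 means integer division.

```python
import numpy as np

def map4_numpy(s, m):
    """Build des[k, j] = s[k - j + 2*(m // 2)] for the rows the original loop fills."""
    s = np.asarray(s, dtype=float)
    n = len(s)
    m_2 = m // 2                          # computed once, as suggested above
    rows = np.arange(n - 2 * m_2 - 1)     # k = i - m_2 over the question's i range
    cols = np.arange(m)
    des = np.zeros((n - m + 1, m))        # rows the loop skips stay zero
    # Broadcasting builds the full 2-D index array in one step
    des[: len(rows), :] = s[rows[:, None] - cols[None, :] + 2 * m_2]
    return des

des = map4_numpy(np.arange(12.0), 5)
print(des.shape)
```

The intermediate index array is as large as des itself, so this trades memory for the removal of the Python-level loop. It is also useful as a reference result when checking a Cython version.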
Upvotes: 3