Reputation: 19
I am trying to speed up the following code with Cython. Two arguments are passed in from Python:
processing_times: a list of 500 lists with 20 integers each (500x20).
sequence: a list of 500 integers.
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
cpdef taillard_acceleration(sequence, processing_times, int job_inserted, int num_machines):
    # Static arrays - number of jobs limited to 500 jobs and 20 machines
    # (one extra row/column, since the loops index up to
    # sequence_length + 1 and num_machines + 1)
    cdef int e[502][22]
    cdef int q[502][22]
    cdef int f[502][22]
    cdef int ms[502]
    # Variables
    cdef int sequence_length, best_makespan, best_position
    cdef int i, j, iq, jq, tmp
    # Initialize some values
    sequence_length = len(sequence)
    iq = sequence_length + 1
    for i in range(1, sequence_length + 2):
        if i < sequence_length + 1:
            e[i][0] = 0
            # q index i
            iq = iq - 1
            q[iq][num_machines + 1] = 0
        f[i][0] = 0
        jq = num_machines + 1
        for j in range(1, num_machines + 1):
            if i == 1:
                e[0][j] = 0
                q[sequence_length + 1][num_machines + 1 - j] = 0
            if i < sequence_length + 1:
                # q index j
                jq = jq - 1
                if e[i][j - 1] > e[i - 1][j]:
                    e[i][j] = e[i][j - 1] + processing_times[sequence[i - 1] - 1][j - 1]
                else:
                    e[i][j] = e[i - 1][j] + processing_times[sequence[i - 1] - 1][j - 1]
                if q[iq][jq + 1] > q[iq + 1][jq]:
                    q[iq][jq] = q[iq][jq + 1] + processing_times[sequence[iq - 1] - 1][jq - 1]
                else:
                    q[iq][jq] = q[iq + 1][jq] + processing_times[sequence[iq - 1] - 1][jq - 1]
            # f(i, j) = max{f(i, j - 1), e(i - 1, j)} + processing time
            if f[i][j - 1] > e[i - 1][j]:
                f[i][j] = f[i][j - 1] + processing_times[job_inserted - 1][j - 1]
            else:
                f[i][j] = e[i - 1][j] + processing_times[job_inserted - 1][j - 1]
    # Makespan - job k in position i
    best_makespan = 0
    best_position = 0
    for i in range(1, sequence_length + 2):
        ms[i] = 0
        for j in range(1, num_machines + 1):
            tmp = f[i][j] + q[i][j]
            if tmp > ms[i]:
                ms[i] = tmp
        # Check best insertion position
        if ms[i] < best_makespan or best_makespan == 0:
            best_makespan = ms[i]
            best_position = i
    return best_position, best_makespan
I was able to get about a 4x speedup over the original Python code:
Just Python: 0.04114614830813535
With Cython: 0.00937230621550278
Cython is 4.390183948543561 times faster
How can I get a better speed improvement in this code?
I have already tried converting sequence and processing_times to numpy arrays outside and then using memoryviews, but I got no improvement from that.
cpdef taillard_acceleration(sequence_np, processing_times_np, int job_inserted, int num_machines):
    # memoryviews
    cdef int [:, :] processing_times = processing_times_np
    cdef int [:] sequence = sequence_np
Also, should I use malloc for the q, e, f, ms arrays? This is my first time using Cython, so I don't know if I'm doing this right. Any help is much appreciated.
Upvotes: 1
Views: 253
Reputation: 30890
Most of it looks like it's properly typed, so you're unlikely to get huge improvements. The main things that aren't typed are sequence and processing_times. You should make these memoryviews:
def taillard_acceleration(int[:] sequence, int[:,:] processing_times, int job_inserted, int num_machines):
I know you already tried this; however, you should also change their indexing to the form processing_times[i, j] (rather than processing_times[i][j]). The chained form creates a 1D memoryview as a temporary object, which is likely to be a little slower.
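The same distinction can be seen with plain NumPy arrays (a sketch, not the compiled Cython path — with memoryviews the temporary is a 1D memoryview object instead of a NumPy view):

```python
import numpy as np

# With a 2-D array, a[1] first builds a temporary 1-D row view,
# then [2] indexes into that; a[1, 2] fetches the element in one step.
a = np.arange(12, dtype=np.intc).reshape(3, 4)

row = a[1]                        # the intermediate object a[i][j] creates
assert not row.flags["OWNDATA"]   # it's a view, rebuilt on every access

# Both forms return the same value; a[i, j] just skips the temporary.
assert a[1][2] == a[1, 2] == 6
```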
For the q, e, f, and ms arrays: what you're doing now is absolutely fine if you're happy recompiling to change the sizes. If you think you might want to change the sizes at runtime, then you should allocate them at runtime. You could use malloc, but I'd use:
cdef int[:, ::1] e = np.zeros((501, 21), dtype=np.intc)
(the [:, ::1] tells Cython that the array is 2D and contiguous; the dtype=np.intc is needed because np.zeros defaults to float64, which won't coerce to an int memoryview). Using numpy like this would be very slightly slower than malloc, but it's also a lot easier and a lot less likely for you to get wrong. If you do this, then change the indexing to e[i, j] as described above.
(It looks like they really should be of size sequence_length by num_machines, so a runtime size might be a good idea.)
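A runtime-sized allocation might look like the sketch below (make_scratch_arrays is a hypothetical helper; the extra row and column leave room for the sentinel indices sequence_length + 1 and num_machines + 1 that the loops touch):

```python
import numpy as np

def make_scratch_arrays(sequence_length, num_machines):
    # One extra row/column beyond the loop bounds, since the code
    # indexes up to sequence_length + 1 and num_machines + 1.
    shape = (sequence_length + 2, num_machines + 2)
    e = np.zeros(shape, dtype=np.intc)   # earliest completion times
    q = np.zeros(shape, dtype=np.intc)   # tails
    f = np.zeros(shape, dtype=np.intc)   # completion times after insertion
    ms = np.zeros(sequence_length + 2, dtype=np.intc)  # makespan per position
    return e, q, f, ms

e, q, f, ms = make_scratch_arrays(500, 20)
assert e.shape == (502, 22)
assert e.dtype == np.intc
assert e.flags["C_CONTIGUOUS"]  # satisfies the ::1 (contiguous) declaration
```

Inside the Cython function these would then be typed as int[:, ::1] views and indexed with e[i, j].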
Upvotes: 1