Reputation: 187
I have a question concerning the speed of loops in Python. I created the following loops to fill values in my array, but it is very slow. Is there a way to make it process faster ?
winW = 1
winH = 200
runlength = np.zeros(shape=(img.shape[0], img.shape[1]))
for y in range(0, img.shape[0] - winH, 1):
for x in range(0, img.shape[1] - winW, 1):
runlength[y, x] += np.sum(img[y:y + winH, x:x + winW]) / (winH * winW)
runlength[y + winH, x] += np.sum(img[y:y + winH, x:x + winW]) / (winH * winW)
Thanks for your help
Edit : I precise that I can only use numpy but not scipy
Upvotes: 0
Views: 179
Reputation: 14316
Let me describe how to speed up the first operation in the for loop, given by
runlength[y, x] += np.sum(img[y:y + winH, x:x + winW]) / (winH * winW)
Basically, you are moving a rectangle of width winW
and height winH
over the image. You start with the upper-left corner of the rectangle at point (0,0) of the image, then you sum all values in the image which lie below this rectangle and divide them by the total number of points. The output at position (0,0) is that number. Then you shift the rectangle one to the right and repeat the procedure until you are at the right end of the image. You move one row down and repeat.
In image processing terms: you apply a spatial filter mask to the image. The filter is an average filter of width winW
and height winH
.
To implement this efficiently, you can use the scipy.ndimage.correlate
function. The input
is your image, the weights
contains the weight by which element below the rectangle is multiplied. In this case that is an array with dimensions (winH, winW)
where every element contains the number 1 / (winH * winW)
. Thus every point of the image which lies below the rectangle is multiplied by 1 / (winH * winW)
, and then everything is summed.
To match your algorithm exactly, we need to set the origin
to (-np.floor(winH/2), -np.floor(winW/2))
to specify that the mean of the rectangle is placed at the location upper right corner of the rectangle in the output.
Finally, to match your algorithm exactly, we have to set all points below (img.shape[0] - winH)
or right of (img.shape[1] - winW)
to zero. The for-loop can thus be replaced with
runlength_corr = correlate(input=img,
weights=np.ones((winH, winW)) / (winW * winH),
origin=(-np.floor(winH/2), -np.floor(winW/2)))
runlength_corr[(img.shape[0] - winH):, :] = 0
runlength_corr[:, (img.shape[1] - winW):] = 0
I compared the run time of the nested for-loops and the correlate
method on a test image of size 512-by-512:
For-loops: Elapsed time: 0.665 sec
Correlate: Elapsed time: 0.085 sec
So this gives a nice speed-up of factor 8. The sum of absolute differences over the entire output is as low as 7.04e-09
, so the outputs of both methods are essentially the same.
Upvotes: 2
Reputation: 29071
For starters, you seem to be calculating the same quantity twice, inside you loop. That alone could half your running time.
Second, if winW
is always 1
, then np.sum(img[y:y + winH, x:x + winW])
is just
np.sum(img[y:y + winH, x])
. That should speed it up a bit.
What remains is how you can speed up np.sum(img[y:y + winH, x])
. You can start with calculating
sum0 = np.sum(img[0: 0 + winH, x])
Now, note that the quantity
sum1 = np.sum(img[1: 1 + winH, x])
differs from the previous one by two pixels only, so, it is equal to sum0 - img[0, x] + img[1 + winH, x]
. For the next y
sum2 = sum1 - img[1, x] + img[2 + winH, x]`
and so on
Upvotes: 0