Reputation: 561
I have the following nested for loop and want to use NumPy vectorization, boolean masks, etc. to improve its execution speed. arr is a NumPy array.
def translate(arr, x=0):
    arr1 = arr.copy()
    for i in range(arr1.shape[0]):
        for j in range(arr1.shape[1]):
            if i + x < arr1.shape[0] and i + x > 0:
                arr1[i,j] = arr[i+x,j]
            if i + x < 0 or i + x > arr1.shape[0]:
                arr1[i,j] = 255
    return arr1
Any help or suggestions will be much appreciated.
Edit: when (i+x) < 0, the value should be arr[i,j] instead of 255:
def translate(arr, x=0):
    arr1 = arr.copy()
    for i in range(arr1.shape[0]):
        for j in range(arr1.shape[1]):
            if i + x < arr1.shape[0] and i + x > 0:
                arr1[i,j] = arr[i+x,j]
            if i + x < 0 or i + x > arr1.shape[0]:
                arr1[i,j] = arr[i,j]
    return arr1
Upvotes: 1
Views: 119
Reputation: 6482
I have optimized your solution a bit (better cache utilization, giving about a 30% speedup), but leaving it as it is would also be fine. The only thing you really have to do with loop-heavy NumPy code is compile it. This can be done with Numba or, with a bit more work, with Cython.
The compiled version should be about 3000 times faster than your solution. Parallelization isn't really necessary here (only about a 20% speedup) because the function is memory-bound.
Example
import numba as nb
import numpy as np

@nb.njit(fastmath=True,parallel=True)
def translate(arr, x):
    arr1 = np.empty(arr.shape,dtype=arr.dtype)
    for i in nb.prange(arr1.shape[0]):
        for j in range(arr1.shape[1]):
            arr1[i,j]=arr[i,j]
            if i + x < arr1.shape[0] and i + x > 0:
                arr1[i,j] = arr[i+x,j]
            if i + x < 0 or i + x > arr1.shape[0]:
                arr1[i,j] = arr[i,j]
    return arr1
Upvotes: 1
Reputation: 561
Thanks for the answers. I tried numpy.roll(arr, shift_amount). It also works, but elements that roll beyond the last position are re-introduced at the first, so it can only be used when (i+x) is less than shape[0].
if (i+x) < arr1.shape[0]:
    arr2 = np.roll(arr1,x)
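One way to work around the wrap-around is to roll along the row axis and then overwrite the rows that wrapped. The following is only a sketch: the name translate_roll and the fill parameter are made up for illustration, and it assumes the intent is a row shift, which needs axis=0 and a negated shift so that row i receives row i+x.

```python
import numpy as np

def translate_roll(arr, x=0, fill=255):
    # Hypothetical helper: np.roll with shift=-x along axis 0 gives
    # out[i] = arr[(i + x) % n], i.e. the rows wrap around.
    out = np.roll(arr, -x, axis=0)
    if x > 0:
        # The last x rows came from the top of the array; overwrite them.
        out[-x:] = fill
    elif x < 0:
        # The first -x rows came from the bottom of the array; overwrite them.
        out[:-x] = fill
    return out

arr = np.arange(12).reshape(4, 3)
shifted_up = translate_roll(arr, 1)     # rows move up by one, last row filled
shifted_down = translate_roll(arr, -2)  # rows move down by two, first two filled
```

Note that without axis=0, np.roll shifts the flattened array and then restores the shape, which moves individual elements across rows rather than whole rows.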
Upvotes: 0
Reputation: 2372
Since @Paul Panzer's solution didn't give the same output as your function, I revised his work using boolean arrays. Hopefully it's what you want.
Code:
def my_translate(arr, x=0):
    arr = arr.copy()
    if x == 0:
        return arr.copy()
    elif x > 0:
        replacement1 = np.zeros(arr.shape)
        replacement1[:-x] = arr[x:]
    else:
        replacement1 = np.zeros(arr.shape)
        replacement1[-x:] = arr[:x]
    replacement2 = np.zeros(arr.shape)+255 # Array filled with 255 for second logic
    l = [np.repeat(i,arr.shape[1]) for i in range(arr.shape[0])]
    firstlooplogic1 = np.asarray(l)+x < arr.shape[0]
    firstlooplogic2 = np.asarray(l)+x > 0
    secondlooplogic1 = np.asarray(l)+x > arr.shape[0]
    secondlooplogic2 = np.asarray(l)+x < 0
    part1logic = np.logical_and(firstlooplogic1,firstlooplogic2)
    part2logic = np.logical_or(secondlooplogic1,secondlooplogic2)
    part1 = part1logic*replacement1
    part2 = part2logic*replacement2
    # Keep the original value only where neither condition applies; testing
    # part1 == 0 instead would also match shifted values that happen to be 0.
    part3 = np.logical_not(np.logical_or(part1logic,part2logic)) * arr
    return (part1 + part2 +part3).astype(arr.dtype)
Result:
arr = 3 - np.maximum.outer(*2*(np.abs(np.arange(-3, 4)),))
output1 = my_translate(arr,-2) #output from my function
output2 = translate(arr,-2) #output from your function above
np.array_equal(output1,output2)
>Out[10]: True
Basically, I just broke your nested for-loop down into boolean arrays and applied the same operations to them.
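Because both conditions in the loop depend only on the row index i, the same idea can be written more compactly with one-dimensional row masks. This is a sketch rather than a drop-in replacement: translate_masked and its fill parameter are hypothetical names, and the masks copy the loop's comparisons literally, including the strict `> 0` and `> shape[0]`.

```python
import numpy as np

def translate_masked(arr, x=0, fill=255):
    # Hypothetical helper mirroring the two `if` conditions of the loop.
    n = arr.shape[0]
    src = np.arange(n) + x                 # source row index i + x for each row i
    copy_rows = (src > 0) & (src < n)      # first condition: copy arr[i + x]
    fill_rows = (src < 0) | (src > n)      # second condition: fill value
    out = arr.copy()                       # rows matching neither stay unchanged
    out[copy_rows] = arr[src[copy_rows]]
    out[fill_rows] = fill
    return out

arr = 3 - np.maximum.outer(*2 * (np.abs(np.arange(-3, 4)),))
result = translate_masked(arr, -2)
```

Working with row masks avoids building full-size boolean arrays, and rows that match neither condition (for example where i + x == 0) keep their original values, as in the loop.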
Upvotes: 1
Reputation: 53029
Here is something that does what I think you want to do (shift the original array up or down and fill the empty space with 255):
>>> def translate_vectorized(arr, x=0):
...     if x == 0:
...         return arr.copy()
...     out = np.full(arr.shape, 255, dtype=arr.dtype)  # keep the input dtype
...     if x < 0:
...         out[-x:] = arr[:x]
...     else:
...         out[:-x] = arr[x:]
...     return out
Demo
>>> arr = 3 - np.maximum.outer(*2*(np.abs(np.arange(-3, 4)),))
>>> arr
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0],
       [0, 1, 2, 2, 2, 1, 0],
       [0, 1, 2, 3, 2, 1, 0],
       [0, 1, 2, 2, 2, 1, 0],
       [0, 1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0, 0]])
>>>
>>> translate_vectorized(arr, -2)
array([[255, 255, 255, 255, 255, 255, 255],
       [255, 255, 255, 255, 255, 255, 255],
       [  0,   0,   0,   0,   0,   0,   0],
       [  0,   1,   1,   1,   1,   1,   0],
       [  0,   1,   2,   2,   2,   1,   0],
       [  0,   1,   2,   3,   2,   1,   0],
       [  0,   1,   2,   2,   2,   1,   0]])
>>> translate_vectorized(arr, 1)
array([[  0,   1,   1,   1,   1,   1,   0],
       [  0,   1,   2,   2,   2,   1,   0],
       [  0,   1,   2,   3,   2,   1,   0],
       [  0,   1,   2,   2,   2,   1,   0],
       [  0,   1,   1,   1,   1,   1,   0],
       [  0,   0,   0,   0,   0,   0,   0],
       [255, 255, 255, 255, 255, 255, 255]])
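For the edited requirement in the question (keep the original values instead of filling with 255), the same slicing idea might be adapted by starting from a copy of the input. This is a sketch under that reading of the edit; translate_keep is a hypothetical name. It follows the stated intent rather than the loop's exact comparisons, so for negative x it differs from the loop in the single row where i + x == 0, which the loop leaves untouched because of its strict `> 0`.

```python
import numpy as np

def translate_keep(arr, x=0):
    # Hypothetical variant: rows with no source row keep their original values.
    out = arr.copy()
    if x > 0:
        out[:-x] = arr[x:]    # row i takes row i + x while i + x is in range
    elif x < 0:
        out[-x:] = arr[:x]    # row i takes row i + x for i >= -x
    return out

arr = np.arange(12).reshape(4, 3)
up_one = translate_keep(arr, 1)     # last row keeps its original values
down_two = translate_keep(arr, -2)  # first two rows keep their original values
```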
Upvotes: 1