Surabhi Amit Chembra

Reputation: 561

Improve the execution speed using Numpy

I have the following for loop and want to use NumPy vectorization, boolean masks, etc. to improve its execution speed. arr is a NumPy array.

def translate(arr, x=0):
    arr1 = arr.copy()
    for i in range(arr1.shape[0]):
        for j in range(arr1.shape[1]):
            if i + x < arr1.shape[0] and i + x > 0:
                arr1[i,j] = arr[i+x,j]
            if i + x < 0 or i + x > arr1.shape[0]:
                arr1[i,j] = 255
    return arr1

Any help or suggestions will be much appreciated.

Edit: when (i+x) < 0, the value should be arr[i,j] instead of 255.

def translate(arr, x=0):
    arr1 = arr.copy()
    for i in range(arr1.shape[0]):
        for j in range(arr1.shape[1]):
            if i + x < arr1.shape[0] and i + x > 0:
                arr1[i,j] = arr[i+x,j]
            if i + x < 0 or i + x > arr1.shape[0]:
                arr1[i,j] = arr[i,j]
    return arr1

Upvotes: 1

Views: 119

Answers (4)

max9111

Reputation: 6482

I have optimized your solution a bit (better cache utilization gives about a 30% speedup), but leaving it as it is would also be fine. The only thing you need to do with loop-heavy NumPy code is compile it. This can be done with Numba or, with a bit more work, with Cython.

The compiled version should be about 3000 times faster than your solution. Parallelization isn't really necessary here (only about a 20% speedup) because the function is limited by memory bandwidth.

Example

import numba as nb
import numpy as np


@nb.njit(fastmath=True,parallel=True)
def translate(arr, x):
    arr1 = np.empty(arr.shape,dtype=arr.dtype)
    for i in nb.prange(arr1.shape[0]):
        for j in range(arr1.shape[1]):
            arr1[i,j]=arr[i,j]
            if i + x < arr1.shape[0] and i + x > 0:
                arr1[i,j] = arr[i+x,j]
            if i + x < 0 or i + x > arr1.shape[0]:
                arr1[i,j] = arr[i,j]
    return arr1

Upvotes: 1

Surabhi Amit Chembra

Reputation: 561

Thanks for the answers. I tried numpy.roll(arr, shift_amount). It also works, but elements that roll past the last position are reintroduced at the first. So it can be used when (i+x) is less than shape[0].

if (i+x) < arr1.shape[0]:
    arr2 = np.roll(arr1,x)
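
A minimal sketch of the wrap-around behaviour described above, using a small made-up array. Two details are assumptions rather than something stated in the post: axis=0 is passed so that whole rows are shifted (without axis, np.roll shifts the flattened array), and the shift is negated so that row i receives arr[i + x] as in the loop.

```python
import numpy as np

# Hypothetical 4x3 example array; row i holds the values 3*i .. 3*i + 2.
arr = np.arange(12).reshape(4, 3)
x = 1

# Row i of `rolled` is arr[(i + x) % n_rows], so the last row wraps around to arr[0].
rolled = np.roll(arr, -x, axis=0)

# To undo the wrap-around, restore the original values in the wrapped rows.
fixed = rolled.copy()
if x > 0:
    fixed[-x:] = arr[-x:]
elif x < 0:
    fixed[:-x] = arr[:-x]
```

Here rolled[3] is a copy of arr[0] (the wrapped row), while fixed[3] keeps arr[3].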

Upvotes: 0

R.yan

Reputation: 2372

Since @Paul Panzer's solution didn't give the same output as your function, I revised his work using boolean arrays. Hopefully it's what you want.

Code:

def my_translate(arr, x=0):
    arr = arr.copy()
    if x == 0:
        return arr.copy()
    elif x > 0:
        replacement1 = np.zeros(arr.shape)
        replacement1[:-x] = arr[x:]
    else:
        replacement1 = np.zeros(arr.shape)
        replacement1[-x:] = arr[:x]
    replacement2 = np.zeros(arr.shape)+255 # Array filled with 255 for second logic
    l = [np.repeat(i,arr.shape[1]) for i in range(arr.shape[0])]
    firstlooplogic1 = np.asarray(l)+x < arr.shape[0]
    firstlooplogic2 = np.asarray(l)+x > 0
    secondlooplogic1 = np.asarray(l)+x > arr.shape[0]
    secondlooplogic2 = np.asarray(l)+x < 0

    part1logic = np.logical_and(firstlooplogic1,firstlooplogic2)
    part2logic = np.logical_or(secondlooplogic1,secondlooplogic2)

    part1 = part1logic*replacement1
    part2 = part2logic*replacement2
    part3 = np.logical_not(np.logical_or(part1logic, part2logic)) * arr  # rows matched by neither condition keep arr
    return (part1 + part2 +part3).astype(arr.dtype)

Result:

arr = 3 - np.maximum.outer(*2*(np.abs(np.arange(-3, 4)),))
output1 = my_translate(arr,-2) #output from my function
output2 = translate(arr,-2)    #output from your function above
np.array_equal(output1,output2)
>Out[10]: True

Basically, I just broke your nested for loop down into boolean arrays and performed the operation on those.
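
The same idea can probably be written more compactly with one boolean mask per row instead of per element. The following is only a sketch: translate_masked and its fill parameter are hypothetical names, and it is meant to mirror the first loop in the question (including its strict > 0 and > shape[0] comparisons), not to be a tuned implementation.

```python
import numpy as np

def translate_masked(arr, x=0, fill=255):
    """Mask-based sketch of the first loop in the question (hypothetical helper)."""
    n = arr.shape[0]
    src = np.arange(n) + x          # source row index i + x for every row i
    take = (src > 0) & (src < n)    # first `if` of the loop
    blank = (src < 0) | (src > n)   # second `if` of the loop
    out = arr.copy()                # rows matched by neither mask stay unchanged
    out[take] = arr[src[take]]      # copy the shifted rows
    out[blank] = fill               # overwrite rows whose source is out of range
    return out
```

Because the masks are one-dimensional, the assignments broadcast over all columns, so no per-element index array is needed.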

Upvotes: 1

Paul Panzer

Reputation: 53029

Here is something that does what I think you want (shift the original array up or down and fill the empty space with 255):

>>> def translate_vectorized(arr, x=0):
...     if x == 0:
...         return arr.copy()
...     out = np.full(arr.shape, 255, dtype=arr.dtype)
...     if x < 0:
...         out[-x:] = arr[:x]
...     else:
...         out[:-x] = arr[x:]
...     return out

Demo

>>> arr = 3 - np.maximum.outer(*2*(np.abs(np.arange(-3, 4)),))
>>> arr
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0],
       [0, 1, 2, 2, 2, 1, 0],
       [0, 1, 2, 3, 2, 1, 0],
       [0, 1, 2, 2, 2, 1, 0],
       [0, 1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0, 0]])
>>> 
>>> translate_vectorized(arr, -2)
array([[255, 255, 255, 255, 255, 255, 255],
       [255, 255, 255, 255, 255, 255, 255],
       [  0,   0,   0,   0,   0,   0,   0],
       [  0,   1,   1,   1,   1,   1,   0],
       [  0,   1,   2,   2,   2,   1,   0],
       [  0,   1,   2,   3,   2,   1,   0],
       [  0,   1,   2,   2,   2,   1,   0]])
>>> translate_vectorized(arr, 1)
array([[  0,   1,   1,   1,   1,   1,   0],
       [  0,   1,   2,   2,   2,   1,   0],
       [  0,   1,   2,   3,   2,   1,   0],
       [  0,   1,   2,   2,   2,   1,   0],
       [  0,   1,   1,   1,   1,   1,   0],
       [  0,   0,   0,   0,   0,   0,   0],
       [255, 255, 255, 255, 255, 255, 255]])
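
The edit to the question asks for uncovered rows to keep their original values rather than 255. A sketch of that variant follows, under the assumption that the intended shift is the plain arr[i + x] with inclusive bounds (the loop in the question uses a strict > 0 test, which this does not reproduce). translate_keep is a hypothetical name.

```python
import numpy as np

def translate_keep(arr, x=0):
    """Shift rows by x; rows with no source row keep their original values."""
    out = arr.copy()
    n = arr.shape[0]
    if x == 0 or abs(x) >= n:
        return out            # nothing to shift, or the shift exceeds the array
    if x < 0:
        out[-x:] = arr[:x]    # move rows down by |x|
    else:
        out[:-x] = arr[x:]    # move rows up by x
    return out
```

Starting from a copy instead of a 255-filled array is the only real difference from the function above; the slicing is the same.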

Upvotes: 1
