Fast Way to Perform Array Computation in Python

Question

I have an image that I want to perform some calculations on. The image pixels will be represented as f(x, y) where x is the column number and y is the row number of each pixel. I want to perform a calculation using the following formula:

D_h(x, y) = |f(x, y+1) - f(x, y-1)|, where pixels outside the image are taken as 0

Here is the code that does the calculation:

import matplotlib.pyplot as plt
import numpy as np
import os.path
from PIL import Image

global image_width, image_height


# A. Blur Measurement
def measure_blur(f):
    D_sub_h = [[0 for y in range(image_height)] for x in range(image_width)]

    for x in range(image_width):
        for y in range(image_height):
            if y == 0:
                f_x_yp1 = f[x][y + 1]
                f_x_ym1 = 0
            elif y == image_height - 1:
                f_x_yp1 = 0
                f_x_ym1 = f[x][y - 1]
            else:
                f_x_yp1 = f[x][y + 1]
                f_x_ym1 = f[x][y - 1]

            D_sub_h[x][y] = abs(f_x_yp1 - f_x_ym1)

    return D_sub_h

if __name__ == '__main__':

    image_counter = 1

    while True:

        if not os.path.isfile(str(image_counter) + '.jpg'):
            break

        image_path = str(image_counter) + '.jpg'
        image = Image.open(image_path)
        image_height, image_width = image.size

        print("Image Width : " + str(image_width))
        print("Image Height : " + str(image_height))

        f = np.array(image)
        D_sub_h = measure_blur(f)
        image_counter = image_counter + 1

The problem is that for large images, such as 5000 x 5000 pixels, this code takes a very long time to complete. Is there a function or technique I can use to speed it up, so that I am not computing each pixel one by one in a manual loop?

Mad Physicist · Accepted Answer

Since you specifically convert the input f to a numpy array, I am assuming you want to use numpy. In that case, the allocation of D_sub_h needs to change from a list to an array:

D_sub_h = np.empty_like(f)
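
One caveat worth checking before relying on np.empty_like(f): an array built from an 8-bit image, as np.array(image) typically produces for a JPEG, usually has dtype uint8, and empty_like inherits that dtype. Unsigned subtraction wraps around instead of going negative. The following is a minimal sketch of the effect and of one possible workaround, casting to a signed type first. The sample values and the choice of int16 are illustrative assumptions, not part of the original answer.

```python
import numpy as np

# Two tiny stand-ins for neighbouring pixel rows of an 8-bit image.
nxt = np.array([10, 200], dtype=np.uint8)
prv = np.array([20, 50], dtype=np.uint8)

# uint8 arithmetic on arrays wraps modulo 256 instead of going negative.
wrapped = nxt - prv

# Casting to a signed type that can hold -255..255 keeps the true difference.
signed = nxt.astype(np.int16) - prv.astype(np.int16)

print(wrapped)  # [246 150]
print(signed)   # [-10 150]
```

If the input may be unsigned, converting f once up front (for example f = f.astype(np.int16)) lets the rest of the answer's code work unchanged, at the cost of one extra copy of the image.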

If we assume that everything outside your array is zeros, then the first row and last row can be computed as the second and negative second-to-last rows, respectively:

D_sub_h[0, :] = f[1, :]
D_sub_h[-1, :] = -f[-2, :]

The remainder of the data is just the difference between the next and previous index at each location, which is idiomatically computed by shifting views: f[2:, :] - f[:-2, :]. This formulation creates a temporary array. You can avoid doing that by using np.subtract explicitly:

np.subtract(f[2:, :], f[:-2, :], out=D_sub_h[1:-1, :])
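
To make the shifted-view idea concrete, here is a small sketch on a 4 x 3 array. The array contents and the names below and above are made up for illustration; the slicing and the np.subtract call are the ones described above.

```python
import numpy as np

f = np.arange(12, dtype=np.int64).reshape(4, 3)  # rows: [0 1 2], [3 4 5], ...

below = f[2:, :]    # rows 2..end:   the "next" row for each interior row
above = f[:-2, :]   # rows 0..end-2: the "previous" row for each interior row

# Basic slices are views onto f, so no pixel data is copied here.
print(np.shares_memory(below, f), np.shares_memory(above, f))  # True True

D_sub_h = np.empty_like(f)
D_sub_h[0, :] = f[1, :]      # next row minus an implicit row of zeros
D_sub_h[-1, :] = -f[-2, :]   # implicit row of zeros minus the previous row

# The result is written straight into the interior rows of D_sub_h,
# so no temporary array is allocated for the difference.
np.subtract(below, above, out=D_sub_h[1:-1, :])

print(D_sub_h.tolist())  # [[3, 4, 5], [6, 6, 6], [6, 6, 6], [-6, -7, -8]]
```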

The entire thing takes four lines in this formulation, and is fully vectorized, which means that loops run quickly under the hood, without most of Python's overhead:

def measure_blur(f):
    D_sub_h = np.empty_like(f)
    D_sub_h[0, :] = f[1, :]
    D_sub_h[-1, :] = -f[-2, :]
    np.subtract(f[2:, :], f[:-2, :], out=D_sub_h[1:-1, :])
    return D_sub_h
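
Note that the loop in the question, as posted, takes abs() of the difference and steps along the second index (f[x][y+1]), while the function above keeps the sign and differences along the first axis. If the goal is to reproduce the loop's numbers exactly, something along these lines should work. This is a sketch rather than the answer's method: the name measure_blur_abs, the axis parameter, and the cast of unsigned input to int32 (to avoid wraparound on 8-bit images) are all assumptions added here.

```python
import numpy as np


def measure_blur_abs(f, axis=0):
    """Return |f[i+1] - f[i-1]| along `axis`, treating out-of-bounds pixels as 0.

    Hypothetical helper: requires at least two entries along `axis`.
    """
    f = np.asarray(f)
    if f.dtype.kind == "u":
        # Unsigned arithmetic would wrap around, so switch to a signed type.
        f = f.astype(np.int32)

    g = np.moveaxis(f, axis, 0)        # view with the chosen axis first
    out = np.empty_like(g)
    out[0] = g[1]                      # next minus implicit zero
    out[-1] = -g[-2]                   # implicit zero minus previous
    np.subtract(g[2:], g[:-2], out=out[1:-1])
    np.abs(out, out=out)               # absolute value, in place
    return np.moveaxis(out, 0, axis)   # restore the original axis order


img = np.array([[10, 20, 5],
                [0, 255, 7]], dtype=np.uint8)

# axis=1 matches the question's f[x][y+1] - f[x][y-1] indexing.
print(measure_blur_abs(img, axis=1).tolist())  # [[20, 5, 20], [255, 7, 255]]
```

With axis=0 the same helper differences along rows, like the function above, but with the absolute value applied.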

Notice that I return the value instead of printing it. When you write functions, get in the habit of returning a value. Printing can be done later, and effectively discards the computation if it replaces a proper return.

The approach shown above is fairly efficient with regard to both time and space. If you would rather write a one-liner, at the cost of several temporary arrays, you can also do:

D_sub_h = np.concatenate((f[1, None], f[2:, :] - f[:-2, :], -f[-2, None]), axis=0)
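
As a quick sanity check, the sketch below confirms that the in-place version and the one-liner agree on a small signed test array. It also includes a third variant based on np.pad, which is not from the answer; it is included only because it makes the "zeros outside the array" assumption explicit. The function names and the random test data are illustrative.

```python
import numpy as np


def blur_inplace(f):
    # The in-place formulation: one output array, no temporaries.
    D_sub_h = np.empty_like(f)
    D_sub_h[0, :] = f[1, :]
    D_sub_h[-1, :] = -f[-2, :]
    np.subtract(f[2:, :], f[:-2, :], out=D_sub_h[1:-1, :])
    return D_sub_h


def blur_concat(f):
    # The one-liner: builds temporaries, then concatenates them.
    return np.concatenate((f[1, None], f[2:, :] - f[:-2, :], -f[-2, None]), axis=0)


def blur_padded(f):
    # Alternative, not from the answer: add an explicit border of zeros
    # along the first axis, then take the shifted difference everywhere.
    pad = [(1, 1)] + [(0, 0)] * (f.ndim - 1)
    p = np.pad(f, pad)  # default mode is constant padding with zeros
    return p[2:] - p[:-2]


rng = np.random.default_rng(0)
f = rng.integers(-100, 100, size=(6, 4)).astype(np.int64)

a, b, c = blur_inplace(f), blur_concat(f), blur_padded(f)
print(np.array_equal(a, b) and np.array_equal(a, c))  # True
```

The padded variant copies the whole image once, so it trades some memory for not having to special-case the first and last rows.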
