Mints

Reputation: 113

An efficient way to iterate over a multidimensional array?

I'm trying to find a way to perform operations on each element across multiple 2D arrays without having to loop over them, or at least without needing two for loops. My code calculates the standard deviation of each pixel over a series of images (arrays). The number of images is not the problem; it is the size of the arrays that makes the code extremely slow. The following is a working example of what I have.

import numpy as np

# reshape(# of images (arrays), # of rows, # of cols)
a = np.arange(32).reshape(2,4,4)

stddev_arr = np.array([])
for i in range(4):
    for j in range(4): 
        pixel = a[0:,i,j]
        stddev = np.std(pixel) 
        stddev_arr = np.append(stddev_arr, stddev)

My actual data is 2000x2000, so this code loops 4,000,000 times. Is there a better way to do this? Any advice is greatly appreciated.

Upvotes: 2

Views: 2213

Answers (2)

pho

Reputation: 25489

You're already using numpy. numpy's std() function takes an axis argument that tells it which axis to operate on (in this case the zeroth axis). Because this offloads the calculation to numpy's C backend (possibly using SIMD optimizations that vectorize many operations for your processor), it is much faster than iterating in Python.

The other time-consuming operation in your code is appending to stddev_arr. Appending to a numpy array is slow because the entire array is copied into new memory before the new element is added. Since you already know how big the result needs to be, you might as well preallocate it.

a = np.arange(32).reshape(2, 4, 4)
stdev = np.std(a, axis=0)

This gives a 4x4 array:

array([[8., 8., 8., 8.],
       [8., 8., 8., 8.],
       [8., 8., 8., 8.],
       [8., 8., 8., 8.]])

To flatten this into a 1D array, do flat_stdev = stdev.flatten().
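Note that flatten() always returns a copy; if you only need a flat view of the result and want to avoid the extra copy, ravel() returns a view whenever possible:

flat_stdev = stdev.flatten()  # always copies the data
flat_view = stdev.ravel()     # returns a view when possible (no copy)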

Comparing the execution times:

# Using only numpy
def fun1(arr):
    return np.std(arr, axis=0).flatten()

# Your function
def fun2(arr):
    stddev_arr = np.array([])
    for i in range(arr.shape[1]):
        for j in range(arr.shape[2]): 
            pixel = arr[0:,i,j]
            stddev = np.std(pixel) 
            stddev_arr = np.append(stddev_arr, stddev)
    return stddev_arr


# Your function, but pre-allocating stddev_arr
def fun3(arr):
    stddev_arr = np.zeros((arr.shape[1] * arr.shape[2],))
    x = 0
    for i in range(arr.shape[1]):
        for j in range(arr.shape[2]): 
            pixel = arr[0:,i,j]
            stddev = np.std(pixel) 
            stddev_arr[x] = stddev
            x += 1
    return stddev_arr

First, let's make sure all these functions are equivalent:

a = np.random.random((3, 10, 10))
assert np.all(fun1(a) == fun2(a))
assert np.all(fun1(a) == fun3(a))

Yup, all give the same result. Now, let's try with a bigger array.

a = np.random.random((3, 100, 100))

import timeit

x = timeit.timeit('fun1(a)', setup='from __main__ import fun1, a', number=10)
# x: 0.003302899989648722

y = timeit.timeit('fun2(a)', setup='from __main__ import fun2, a', number=10)
# y: 5.495519500007504

z = timeit.timeit('fun3(a)', setup='from __main__ import fun3, a', number=10)
# z: 3.6250679999939166

Wow! We get a ~1.5x speedup just by preallocating. Even more wow: using numpy's std() with the axis argument gives a >1000x speedup, and that is just for a 100x100 array! With bigger arrays, you can expect an even bigger speedup.
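At the question's scale, the same one-liner applies directly; the stack of 10 images below is just an assumed example size:

import numpy as np

# assumed example: a stack of 10 images, each 2000x2000 pixels
imgs = np.random.random((10, 2000, 2000))
flat_stdev = np.std(imgs, axis=0).flatten()  # 4,000,000 per-pixel std values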

Upvotes: 4

Luke B

Reputation: 1194

Based on what you have provided, you can reshape your array to vectorize the computation and replace your two loops. Then you only need to call np.std once, on the axis you want.

a = np.arange(32).reshape(2, 4, 4)

a = a.reshape(2, -1).transpose()

stddev_arr = np.std(a, axis=1)
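
As a quick sanity check, this should give the same per-pixel values (in row-major order) as np.std(a, axis=0).flatten() from the other answer:

import numpy as np

a = np.arange(32).reshape(2, 4, 4)
via_reshape = np.std(a.reshape(2, -1).transpose(), axis=1)
via_axis = np.std(a, axis=0).flatten()
assert np.allclose(via_reshape, via_axis)  # same values, row-major pixel order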

Upvotes: 1
