Improving runtime of weighted moving average filter function?

Question

I have a weighted moving average function which smooths a curve by averaging 3*width values to the left and to the right of each point using a gaussian weighting mechanism. I am only worried about smoothing a region bounded by [start, end]. The following code works, but the problem is runtime with large arrays.

import numpy as np
def weighted_moving_average(x, y, start, end, width = 3):
    def gaussian(x, a, m, s):
        return a*exp(-(x-m)**2/(2*s**2))
    cut = (x>=start-3*width)*(x<=end+3*width)
    x, y = x[cut], y[cut]
    x_avg = x[(x>=start)*(x<=end)]
    y_avg = np.zeros(len(x_avg))
    bin_vals = np.arange(-3*width,3*width+1)
    weights = gaussian(bin_vals, 1, 0, width)
    for i in range(len(x_avg)):
        y_vals = y[i:i+6*width+1]
        y_avg[i] = np.average(y_vals, weights = weights)
    return x_avg, y_avg

From my understanding, it is generally inefficient to loop through a NumPy array. I was wondering if anyone had an idea to replace the for loop with something more runtime efficient.

Thanks

Divakar · Accepted Answer

That slicing and summing/averaging on a weighted window basically corresponds to 1D convolution with the kernel being flipped. Now, for 1D convolution, NumPy has a very efficient implementation in np.convolve and that could be used to get rid of the loop and give us y_avg. Thus, we would have a vectorized implementation like so -

y_sums = np.convolve(y,weights[::-1],'valid')
y_avg = np.true_divide(y_sums,weights.sum())

Improving runtime of weighted moving average filter function?

Answers (2)

Related Questions