Cut peaks and troughs

Question

Here is an algorithm I would like to implement using numpy:

For a given 1D array, calculate the maximum and the minimum over a sliding window. Create a new array whose first value is equal to the first value of the given array. For each subsequent value, clip the previous value inserted in the new array between the min and the max from the sliding window.

As an example, let's take the array a=[3, 4, 5, 4, 3, 2, 3, 3] and a sliding window of size 3. We find for min and max:

min = [3, 4, 3, 2, 2, 2]
max = [5, 5, 5, 4, 3, 3]

Now our output array will start with the first element from a, so it's 3. And for the next value, I clip 3 (the last value inserted) between 4 and 5 (the min and max found at index 1). The result is 4. For the next value I clip 4 between 3 and 5. It's still 4. And so on. So we finally have:

output = [3, 4, 4, 4, 3, 3]
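
The worked example above can be checked with a few lines of plain Python. This is only a reference sketch of the procedure as described, not an efficient implementation; the helper name clip_through_windows is made up for illustration.

```python
def clip_through_windows(values, size):
    """Reference version in plain Python (hypothetical helper name)."""
    # One window per valid start position.
    windows = [values[i:i + size] for i in range(len(values) - size + 1)]
    lows = [min(w) for w in windows]
    highs = [max(w) for w in windows]

    output = []
    last = values[0]
    for low, high in zip(lows, highs):
        # Clip the previously emitted value into the current window's range.
        last = min(max(last, low), high)
        output.append(last)
    return lows, highs, output


lows, highs, output = clip_through_windows([3, 4, 5, 4, 3, 2, 3, 3], 3)
print(lows)    # [3, 4, 3, 2, 2, 2]
print(highs)   # [5, 5, 5, 4, 3, 3]
print(output)  # [3, 4, 4, 4, 3, 3]
```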

I cannot find a way to avoid using a python for loop in my code. Here is what I have for the moment:

import numpy as np
from collections import deque

def second_window(array, samples):
    sample_idx = samples - 1
    output = np.zeros_like(array[0:-sample_idx])
    start, stop = 0, len(array)
    last_value = array[0]
    # Sliding window is a deque of length 'samples'.
    sliding_window = deque(array[start : start+sample_idx], samples)
    for i in range(stop - start - sample_idx):
        # Get the next value in sliding window. After the first loop,
        # the left value gets discarded automatically.
        sliding_window.append(array[start + i + sample_idx])
        min_value, max_value = min(sliding_window), max(sliding_window)
        # Clip the last value between sliding window min and max
        last_value = min( max(last_value, min_value), max_value)
        output[start + i] = last_value
    return output

Would it be possible to achieve this result with only numpy?

Jaime · Accepted Answer

I don't think you can. You can sometimes do this kind of iterative computation with unbuffered ufuncs, but this isn't one of those cases. But let me elaborate...

OK, first, the windowing and min/max calculations can be done much faster:

>>> import numpy as np
>>> from numpy.lib.stride_tricks import as_strided
>>> a = np.array([3, 4, 5, 4, 3, 2, 3, 3])
>>> len_a = len(a)
>>> win = 3
>>> win_a = as_strided(a, shape=(len_a-win+1, win), strides=a.strides*2)
>>> win_a
array([[3, 4, 5],
       [4, 5, 4],
       [5, 4, 3],
       [4, 3, 2],
       [3, 2, 3],
       [2, 3, 3]])
>>> min_ = np.min(win_a, axis=-1)
>>> max_ = np.max(win_a, axis=-1)
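
As an aside, the same windowed view can probably be built without hand-written strides. The sketch below assumes NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view is available; it is an alternative to the as_strided call above, not part of the original approach.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.array([3, 4, 5, 4, 3, 2, 3, 3])
win = 3

# Read-only view of shape (len(a) - win + 1, win); no data is copied.
win_a = sliding_window_view(a, win)

min_ = win_a.min(axis=-1)
max_ = win_a.max(axis=-1)
print(min_)  # [3 4 3 2 2 2]
print(max_)  # [5 5 5 4 3 3]
```

The view is read-only and its shape is checked for you, which avoids the out-of-bounds reads that a mistyped as_strided call can cause.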

Now, let's create and fill up your output array:

>>> out = np.empty((len_a-win+1,), dtype=a.dtype)
>>> out[0] = a[0]

If np.clip were a ufunc, we could then try to do:

>>> np.clip(out[:-1], min_[1:], max_[1:], out=out[1:])
array([4, 3, 3, 3, 3])
>>> out
array([3, 4, 3, 3, 3, 3])

But this doesn't work, because np.clip is not a ufunc, and there seems to be some buffering involved.

And if you apply np.minimum and np.maximum separately, then it doesn't always work:

>>> np.minimum(out[:-1], max_[1:], out=out[1:])
array([3, 3, 3, 3, 3])
>>> np.maximum(out[1:], min_[1:], out=out[1:])
array([4, 3, 3, 3, 3])
>>> out
array([3, 4, 3, 3, 3, 3])

although for your particular case reversing the order does work:

>>> np.maximum(out[:-1], min_[1:], out=out[1:])
array([4, 4, 4, 4, 4])
>>> np.minimum(out[1:], max_[1:], out=out[1:])
array([4, 4, 4, 3, 3])
>>> out
array([3, 4, 4, 4, 3, 3])
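
Since the in-place trick only works for some inputs, a more dependable option is to keep the vectorized min/max and run just the clipping step in a Python loop. The following is a minimal sketch of that idea; the function name clip_to_window is hypothetical. Note also that, as far as I know, NumPy 1.13 and later make temporary copies when a ufunc's input and output overlap, so the overlapping out= calls above may not reproduce these results on current versions.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def clip_to_window(a, win):
    """Hypothetical helper: vectorized window extrema, looped clipping."""
    a = np.asarray(a)
    n_out = len(a) - win + 1
    # Same strided view as above: row i holds a[i:i + win].
    win_a = as_strided(a, shape=(n_out, win), strides=a.strides * 2)
    min_ = win_a.min(axis=-1)
    max_ = win_a.max(axis=-1)

    out = np.empty(n_out, dtype=a.dtype)
    last = a[0]
    for i in range(n_out):
        # Each output depends on the previous one, so this step stays sequential.
        last = min(max(last, min_[i]), max_[i])
        out[i] = last
    return out


print(clip_to_window(np.array([3, 4, 5, 4, 3, 2, 3, 3]), 3))  # [3 4 4 4 3 3]
```

The loop now does only two comparisons per element, so most of the cost of the original deque-based version (recomputing min and max of the window at every step) is gone.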
