Richard

Reputation: 154

How do I quickly decimate a numpy array?

I need a function that decimates a numpy array, i.e. removes m out of every n elements. For example, remove 1 in 2 or remove 2 in 3. So an array which is: [7, 4, 3, 5, 9, 2, 4, 1, 6, 8]

decimated by 1:2 would become: [7, 3, 9, 4, 6]

I wonder if it is possible to reshape the array from a 1D array of length N to a 2D array of shape (N/2, 2), then drop the extra dimension?

Ideally, rather than just discarding the decimated samples, I would like to keep the maximum value of each set (in this example, each pair) of values. For example: [7, 5, 9, 4, 8]

Is there a way to find the maximum value across each set rather than just to drop it?

The added challenge is that the point here is to plot the values.

The decimation is required because plotting every value takes too long, so I have to reduce the size of the array before plotting it, and I need to do this quickly. That means for or while loops would be too slow.

Upvotes: 8

Views: 13609

Answers (3)

Alexander Rakhmaev

Reputation: 1055

Be wary of simply throwing samples away, because significant readings can be discarded along with them.

For the task you describe, it is worth using proper decimation, which low-pass filters the signal before downsampling it.

Unfortunately numpy does not provide this, but scipy does (scipy.signal.decimate).

The code below gives an example in which simply discarding samples loses an important feature of the data.

[Image: plot of the original data, the scipy-decimated data, and the manually thinned data]

As you can see, the original data (blue) has a peak. Manual thinning can simply skip it (green). If you apply decimation from the library, the peak is included in the result (orange).

import matplotlib.pyplot as plt
import numpy as np
from scipy import signal

downsampling_factor = 2

# Random data with a single large peak in the middle (at odd index 25)
t = np.linspace(0, 1, 50)
half = len(t) // 2
y = np.concatenate([np.random.randint(0, 10, half), [50], np.random.randint(0, 10, half - 1)])

# Library decimation: anti-aliasing filter, then downsampling
ydem = signal.decimate(y, downsampling_factor)
t_new = np.linspace(0, 1, len(ydem))

# Manual thinning: keep every downsampling_factor-th sample
manual_decimation = y[::downsampling_factor]
t_manual_decimation = np.linspace(0, 1, len(manual_decimation))

plt.plot(t, y, '.-', t_new, ydem, 'o-', t_manual_decimation, manual_decimation, 'x-')
plt.legend(['data', 'scipy decimate', 'manual decimate'], loc='best')
plt.show()

In general, this is not a trivial task, so please be careful.

UPD: note that signal.decimate requires the input vector to be longer than 27 samples.
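
The effect can also be checked numerically, without plotting. The following is a minimal sketch on deterministic data: a flat signal with one spike at an odd index. The variable names are illustrative and not taken from the code above. Keep in mind that signal.decimate low-pass filters before downsampling, so the spike shows up in the output in smeared, attenuated form rather than at its full height.

```python
import numpy as np
from scipy import signal

q = 2  # downsampling factor

# Flat signal with a single spike at an odd index, so that keeping
# every 2nd sample (the even indices) misses it entirely.
y = np.zeros(50)
y[25] = 50.0

thinned = y[::q]                   # plain slicing: the spike is lost
decimated = signal.decimate(y, q)  # anti-aliasing filter, then downsample

print("max after slicing:   ", thinned.max())
print("max after decimation:", decimated.max())
```

If the exact peak height matters for the plot, a per-block maximum is likely a better fit than filtering.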

Upvotes: 5

Gerard Kruisheer

Reputation: 71

A quick and dirty way is:

import numpy as np

k, N = 3, 18
a = np.random.randint(0, 10, N)  # [9, 6, 6, 6, 8, 4, 1, 4, 8, 1, 2, 6, 1, 8, 9, 8, 2, 8]
a = a[::k]  # [9, 6, 1, 1, 1, 8] (a[:-k:k] would drop the last group)

This should work regardless of k dividing into N or not.
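
As a quick check against the example array from the question, a plain strided slice `a[::k]` reproduces the expected 1-in-2 result, and the reshape idea raised in the question gives the same values when k divides N. This is only a sketch for illustration.

```python
import numpy as np

a = np.array([7, 4, 3, 5, 9, 2, 4, 1, 6, 8])
k = 2

# Keep the first sample of every group of k (basic slicing returns a view)
sliced = a[::k]

# Same values via reshape; this only works when k divides len(a)
reshaped = a.reshape(-1, k)[:, 0]

print(sliced)    # [7 3 9 4 6]
print(reshaped)  # [7 3 9 4 6]
```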

Upvotes: 7

Paul Panzer

Reputation: 53029

To find the maximum:

1) k divides N:

import numpy as np

k, N = 3, 18
a = np.random.randint(0,10,N)
a
# array([0, 6, 6, 3, 7, 0, 9, 2, 3, 2, 5, 4, 2, 6, 9, 6, 3, 2])
a.reshape(-1,k).max(1)
# array([6, 7, 9, 5, 9, 6])

2) k does not divide N:

k,N = 4,21
a = np.random.randint(0,10,N)
a
# array([4, 4, 6, 0, 0, 1, 7, 8, 2, 3, 0, 5, 7, 1, 1, 5, 7, 8, 3, 1, 7])
np.maximum.reduceat(a, np.arange(0,N,k))
# array([6, 8, 5, 7, 8, 7])

2) should always work, but I suspect 1) is faster where applicable.
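
The two cases can be wrapped in one helper. The sketch below is only an illustration, and the name block_max is hypothetical (it is not a numpy function). It assumes a non-empty 1-D input, takes the reshape route when k divides N, and falls back to reduceat otherwise.

```python
import numpy as np

def block_max(a, k):
    """Maximum of each consecutive block of k samples (the last block may be shorter)."""
    a = np.asarray(a)
    n = a.size
    if n % k == 0:
        # Case 1: k divides N, so the data fits a (N/k, k) grid exactly
        return a.reshape(-1, k).max(axis=1)
    # Case 2: reduce over slices starting at 0, k, 2k, ...
    return np.maximum.reduceat(a, np.arange(0, n, k))

print(block_max([7, 4, 3, 5, 9, 2, 4, 1, 6, 8], 2))  # [7 5 9 4 8]
print(block_max(np.arange(7), 3))                     # [2 5 6]
```

The first call reproduces the per-pair maximum asked for in the question.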

Upvotes: 3
