Reputation: 1777
I have many arrays of different lengths, and I want to bring them all to a fixed length, say 100 samples. The arrays contain time series, and I do not want to lose the shape of those series when reducing the size of the array. I think what I need is an undersampling (downsampling) algorithm. Is there an easy way to reduce the number of samples in an array, for example by averaging groups of values?
Thanks
Upvotes: 1
Views: 2020
Reputation: 3507
Here's a shorter version of Nick Fellingham's answer.
from math import floor

def sample(input, count):
    ss = float(len(input)) / count
    return [input[int(floor(i * ss))] for i in range(count)]
Upvotes: 1
Reputation: 384
Here's a little script to do it without numpy. It maintains the shape even if the required length is larger than the length of the array.
from math import floor

def sample(input, count):
    output = []
    sample_size = float(len(input)) / count
    for i in range(count):
        output.append(input[int(floor(i * sample_size))])
    return output
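The function above picks one value per step rather than averaging. If averaging is what is wanted, as the question asks, a minimal sketch along the same lines could look like the following. The name `sample_mean` is made up for illustration, and it assumes a non-empty list of numbers.

```python
def sample_mean(values, count):
    """Reduce `values` to `count` points by averaging consecutive bins.

    Hypothetical helper: assumes `values` is a non-empty sequence of numbers.
    """
    step = float(len(values)) / count
    output = []
    for i in range(count):
        start = int(i * step)
        # Keep at least one element per bin, so this also works when
        # `count` is larger than len(values).
        end = max(int((i + 1) * step), start + 1)
        chunk = values[start:end]
        output.append(sum(chunk) / float(len(chunk)))
    return output
```

For example, `sample_mean([1, 2, 3, 4, 5, 6], 3)` averages the pairs `[1, 2]`, `[3, 4]` and `[5, 6]`. Averaging smooths out spikes, so sharp peaks in the series will come out lower than in the original.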
Upvotes: 2
Reputation: 830
You can index the array with randomly generated indices and keep your original array (or only its shape, to reduce memory usage):
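import numpy as np

input_data = somearray  # placeholder for your own array
shape = input_data.shape
n_samples = 100
# randint draws with replacement, so some indices may repeat.
# Sorting keeps the picked samples in their original time order.
inds = np.sort(np.random.randint(0, shape[0], size=n_samples))
sub_samples = input_data[inds]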
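Random picks can land unevenly along the series. As an alternative sketch (not part of the answer above), evenly spaced positions with linear interpolation via `np.interp` give a deterministic result. The name `resample_linear` is made up for illustration, and it assumes a 1-D series with at least two points.

```python
import numpy as np

def resample_linear(values, count):
    """Resample a 1-D series to `count` evenly spaced points.

    Hypothetical helper: assumes `values` is 1-D with at least two points.
    """
    values = np.asarray(values, dtype=float)
    # Put the original and the new samples on the same 0..1 axis.
    old_x = np.linspace(0.0, 1.0, num=len(values))
    new_x = np.linspace(0.0, 1.0, num=count)
    return np.interp(new_x, old_x, values)
```

This keeps the first and last values of the series, and it works for both shrinking and stretching. It does not average, so noise in the input is sampled rather than smoothed.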
Upvotes: 2