Reputation: 621
I have the following situation:
A numpy array
x = np.array([12,3,34,5...,])
where every entry corresponds to a simulation result (time-step 15min).
Now I need the mean hourly value (mean value of first 4 elements, then next 4, etc.) stored in a new numpy array. Is there a very simple method to accomplish this?
Upvotes: 2
Views: 591
Reputation: 589
Here is another solution.
Your input:
In [11]: x = np.array([12, 3, 34, 5, 1, 2, 3])
Split x into consecutive chunks of 4 elements and store them in b (the last chunk may be shorter):
In [12]: b = [x[n:n+4] for n in range(0, len(x), 4)]
Create an empty list and append the mean of each chunk:
In [13]: l = []
In [14]: for i in b:
....: l.append(np.mean(i))
....:
In [15]: l
Out[15]: [13.5, 2.0]
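The chunk-and-average loop above can also be wrapped in a small helper. This is only a sketch: the name chunk_means and its size parameter are made up for illustration and are not part of NumPy.

```python
import numpy as np

def chunk_means(values, size=4):
    """Mean of each consecutive chunk of `size` elements.

    Hypothetical helper; the last chunk may hold fewer than `size` values.
    """
    values = np.asarray(values, dtype=float)
    # Slice the array in steps of `size` and average each slice.
    return np.array([values[i:i + size].mean()
                     for i in range(0, len(values), size)])

x = np.array([12, 3, 34, 5, 1, 2, 3])
hourly = chunk_means(x)
print(hourly)
```

The result is a NumPy array rather than a Python list, which is what the question asked for.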
Upvotes: 1
Reputation: 880389
To handle arrays whose size may not be a multiple of 4, copy x into a new array, tmp, whose size is a multiple of 4:
tmp = np.full((((x.size-1) // 4)+1)*4, dtype=float, fill_value=np.nan)
tmp[:x.size] = x
Empty values are represented by nan. Then you can reshape and use nanmean to compute the mean for each row. np.nanmean is like np.mean except that it ignores nans:
x = np.array([12,3,34,5,1])
tmp = np.full((((x.size-1) // 4)+1)*4, dtype=float, fill_value=np.nan)
tmp[:x.size] = x
tmp = tmp.reshape(-1, 4)
print(np.nanmean(tmp, axis=1))
prints
[ 13.5 1. ]
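If the array length is already known to be an exact multiple of 4, the padding step should not be necessary and a plain reshape is enough. A minimal sketch, assuming an 8-element input invented for this example:

```python
import numpy as np

# Assumed example input: two full hours of 15-minute values.
x = np.array([12, 3, 34, 5, 1, 2, 3, 6])

# reshape raises ValueError if x.size is not a multiple of 4,
# so this variant only applies to complete hours.
hourly = x.reshape(-1, 4).mean(axis=1)
print(hourly)
```

Each row of the reshaped array holds one hour, so the mean along axis 1 gives one value per hour.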
If you have pandas installed, then you could build a timeseries and group by a time interval:
import numpy as np
import pandas as pd
x = np.array([12,3,34,5,1])
s = pd.Series(x, index=pd.date_range('2000-1-1', periods=x.size, freq='15min'))
# pd.TimeGrouper has been removed from pandas; pd.Grouper(freq=...) replaces it
result = s.groupby(pd.Grouper(freq='1h')).mean()
print(result)
yields
2000-01-01 00:00:00 13.5
2000-01-01 01:00:00 1.0
Freq: H, dtype: float64
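For time-indexed data, Series.resample is another way to express the same hourly grouping. A short sketch, assuming the same made-up start date used in this answer:

```python
import numpy as np
import pandas as pd

x = np.array([12, 3, 34, 5, 1])
# The start date is arbitrary; only the 15-minute spacing matters here.
index = pd.date_range('2000-01-01', periods=x.size, freq='15min')
s = pd.Series(x, index=index)

# Group the 15-minute samples into hourly bins and average each bin.
hourly = s.resample('1h').mean()
print(hourly)
```

The trailing partial hour is averaged over the values it actually contains, so no nan padding is needed.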
Upvotes: 2
Reputation: 2931
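import numpy as np
x = np.array([12, 3, 34, 5, 1])  # example input
N = 4  # samples per hour
mod_ = x.size % N
# Pad with nan up to the next multiple of N (no padding if x.size is already a multiple)
x1 = np.pad(x.astype(float), (0, (mod_ > 0) * (N - mod_)), 'constant', constant_values=(np.nan,))
# One row per hour; use N here instead of a hard-coded 4
x2 = x1.reshape(-1, N)
x3 = np.nanmean(x2, axis=1)
print(x3)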
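A different technique that avoids padding altogether is np.add.reduceat, which sums the array between given start indices. This is offered as an alternative sketch, not as the method used above; the variable names are chosen for illustration.

```python
import numpy as np

x = np.array([12, 3, 34, 5, 1], dtype=float)
N = 4  # samples per hour

starts = np.arange(0, x.size, N)             # first index of each block
sums = np.add.reduceat(x, starts)            # sum of each block
counts = np.diff(np.append(starts, x.size))  # length of each block
block_means = sums / counts
print(block_means)
```

Dividing by the actual block lengths means the final, shorter block is averaged over only the values it holds.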
Upvotes: 1