Reputation: 6020
I would like to take a weighted average of "cycle" using "day" as the window. The window size is not always the same. How do I compute a weighted average in pandas?
In [3]: data = {'cycle':[34.1, 41, 49.0, 53.9, 35.8, 49.3, 38.6, 51.2, 44.8],
'day':[6,6,6,13,13,20,20,20,20]}
In [4]: df = pd.DataFrame(data, index=np.arange(9), columns = ['cycle', 'day'])
In [5]: df
Out[5]:
   cycle  day
0   34.1    6
1   41.0    6
2   49.0    6
3   53.9   13
4   35.8   13
5   49.3   20
6   38.6   20
7   51.2   20
8   44.8   20
I would expect three values (if I have done this correctly):
34.1 * 1/3 + 41 * 1/3 + 49 * 1/3 = 41.36
cycle day
41.36 6
6.90 13
45.90 20
Upvotes: 0
Views: 72
Reputation: 109520
Group on day, and then apply a lambda function that calculates the sum of the group and divides it by the number of non-null values within the group.
>>> df.groupby('day').cycle.apply(lambda group: group.sum() / group.count())
day
6     41.366667
13    44.850000
20    45.975000
Name: cycle, dtype: float64
Although you say weighted average, I don't believe any weights are involved. It appears to be a simple average of the cycle value for a particular day. In fact, a simple mean should suffice.
Also, I believe the value for day 13 should be calculated as 53.9 * 1/2 + 35.8 * 1/2, which yields 44.85. The same approach applies to day 20.
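To illustrate the point about weights, here is a minimal sketch comparing the plain per-day mean with a genuinely weighted mean. The `weight` column and its values are hypothetical (the question does not define any weights); they are only there to show what a weighted average would look like if weights existed.

```python
import pandas as pd

df = pd.DataFrame({
    "cycle": [34.1, 41, 49.0, 53.9, 35.8, 49.3, 38.6, 51.2, 44.8],
    "day": [6, 6, 6, 13, 13, 20, 20, 20, 20],
})

# Equal weights within each day reduce to the plain group mean.
simple = df.groupby("day")["cycle"].mean()

# Hypothetical weights, NOT part of the original question.
df["weight"] = [1, 2, 1, 3, 1, 1, 1, 1, 1]

# Weighted mean per day: sum(cycle * weight) / sum(weight) within each group.
weighted_sum = (df["cycle"] * df["weight"]).groupby(df["day"]).sum()
weight_total = df.groupby("day")["weight"].sum()
weighted = weighted_sum / weight_total

print(simple)
print(weighted)
```

For day 20, where every hypothetical weight is 1, the weighted result matches the simple mean; the two differ only where the weights are unequal.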
Upvotes: 0
Reputation: 253
If I'm understanding correctly, I think you just want:
df.groupby(['day']).mean()
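If the goal is a result laid out like the table in the question, with `cycle` and `day` as ordinary columns, something along these lines may help. This is only a sketch; the variable name `result` is an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "cycle": [34.1, 41, 49.0, 53.9, 35.8, 49.3, 38.6, 51.2, 44.8],
    "day": [6, 6, 6, 13, 13, 20, 20, 20, 20],
})

# as_index=False keeps "day" as a regular column instead of the index.
result = df.groupby("day", as_index=False)["cycle"].mean()

# Reorder the columns to match the layout used in the question.
result = result[["cycle", "day"]]
print(result)
```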
Upvotes: 1