Rohit

Reputation: 6020

Weighted average based on a variable window in pandas

I would like to take a weighted average of "cycle", using "day" as the window. The window size is not always the same, because the number of rows per day varies. How do I compute this weighted average in pandas?

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: data = {'cycle': [34.1, 41, 49.0, 53.9, 35.8, 49.3, 38.6, 51.2, 44.8],
   ...:         'day': [6, 6, 6, 13, 13, 20, 20, 20, 20]}

In [4]: df = pd.DataFrame(data, index=np.arange(9), columns=['cycle', 'day'])

In [5]: df
Out[5]:
   cycle  day
0   34.1    6
1   41.0    6
2   49.0    6
3   53.9   13
4   35.8   13
5   49.3   20
6   38.6   20
7   51.2   20
8   44.8   20

I would expect three values (if I have done this correctly):

34.1 * 1/3 + 41 * 1/3 + 49 * 1/3 = 41.36

cycle   day
41.36     6
 6.90    13
45.90    20

Upvotes: 0

Views: 72

Answers (2)

Alexander

Reputation: 109520

Group by day, and then apply a lambda function that calculates the sum of each group and divides it by the number of non-null values within the group.

>>> df.groupby('day').cycle.apply(lambda group: group.sum() / group.count())
day
6     41.366667
13    44.850000
20    45.975000
Name: cycle, dtype: float64

Although you say weighted average, I don't believe there are any weights involved. It appears to be a simple average of the cycle values for each day, so a plain mean should suffice.

Also, I believe the value for day 13 should be calculated as 53.9 * 1/2 + 35.8 * 1/2, which yields 44.85. The same approach applies to day 20: each of its four values gets a weight of 1/4, giving 45.975.
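To make the "no weights involved" point concrete, here is a minimal sketch. It assumes the data from the question; the `weight` column is hypothetical (it does not exist in the original frame) and is set to 1.0 everywhere only to show that equal weights reproduce the plain mean. If the rows did carry real weights, the same sum-of-products over sum-of-weights pattern should apply.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'cycle': [34.1, 41, 49.0, 53.9, 35.8, 49.3, 38.6, 51.2, 44.8],
    'day': [6, 6, 6, 13, 13, 20, 20, 20, 20],
})

# Equal weights within each day: the "weighted" average is just the mean.
simple = df.groupby('day')['cycle'].mean()

# Hypothetical per-row weights (an assumption, not part of the question).
weighted_df = df.assign(weight=1.0)

# Weighted average per day = sum(cycle * weight) / sum(weight).
weighted_sum = (weighted_df['cycle'] * weighted_df['weight']).groupby(weighted_df['day']).sum()
weight_total = weighted_df['weight'].groupby(weighted_df['day']).sum()
weighted = weighted_sum / weight_total

print(simple)
print(weighted)
```

With all weights equal, both series should agree for every day; changing the values in `weight` is what would make the two results diverge.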

Upvotes: 0

SQLnoob

Reputation: 253

If I'm understanding correctly, I think you just want:

df.groupby(['day']).mean()
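If you want the result laid out like the table in the question (a `cycle` column followed by a `day` column), something along these lines might work. This is only a sketch: the rounding to two decimals and the column reordering are my assumptions about the desired output.

```python
import pandas as pd

df = pd.DataFrame({
    'cycle': [34.1, 41, 49.0, 53.9, 35.8, 49.3, 38.6, 51.2, 44.8],
    'day': [6, 6, 6, 13, 13, 20, 20, 20, 20],
})

# as_index=False keeps 'day' as a regular column instead of the index.
result = df.groupby('day', as_index=False)['cycle'].mean()

# Reorder the columns to match the question's layout and round for display.
result = result[['cycle', 'day']].round(2)

print(result)
```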

Upvotes: 1
