Jmc

Reputation: 830

Calculate an incremental mean using python pandas

I'd like to generate a series that is the incremental (running) mean of a time series. That is, starting from the first date (index 0), the mean stored in row x is the average of the values in rows 0 through x, inclusive.

data
index   value   mean          formula
0       4
1       5
2       6
3       7       5.5           average(0-3)
4       4       5.2           average(0-4)
5       5       5.166666667   average(0-5)
6       6       5.285714286   average(0-6)
7       7       5.5           average(0-7)

I'm hoping there's a way to do this without looping, so that it takes advantage of pandas.

Upvotes: 35

Views: 31754

Answers (3)

jpobst

Reputation: 3711

Here's an update for newer versions of pandas (0.18.0 and later):

df['value'].expanding().mean()

or, if the data is already a Series s:

s.expanding().mean()
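
For illustration, here is a minimal sketch of the expanding mean applied to the sample values from the question. The DataFrame name `df` and the column names `value` and `mean` are assumptions taken from the question's table layout, not something the original answer specifies.

```python
import pandas as pd

# Sample values from the question, stored in a column named "value" (assumed name).
df = pd.DataFrame({"value": [4, 5, 6, 7, 4, 5, 6, 7]})

# Expanding window: row x holds the mean of rows 0 through x, inclusive.
df["mean"] = df["value"].expanding().mean()

print(df)
```

With the default settings every row gets a value, including the first three rows that are left blank in the question's table.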

Upvotes: 67

patricksurry

Reputation: 5878

Another approach is to use cumsum(), and divide by the cumulative number of items, for example:

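In [1]:
    import numpy as np
    import pandas as pd
    s = pd.Series([4, 5, 6, 7, 4, 5, 6, 7])
    # cumulative sum divided by the cumulative count (1, 2, ..., len(s))
    s.cumsum() / pd.Series(np.arange(1, len(s)+1), s.index)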

Out[1]:
0    4.000000
1    4.500000
2    5.000000
3    5.500000
4    5.200000
5    5.166667
6    5.285714
7    5.500000
dtype: float64
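
As a rough cross-check, the sketch below divides the cumulative sum by a plain NumPy array of counts and compares the result with the built-in expanding mean. The variable name `running_mean` is only illustrative, and the sketch assumes the series has no missing values; with NaN entries a positional count would include the missing rows, so the two results could differ.

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 5, 6, 7, 4, 5, 6, 7])

# Running total divided by the running count 1, 2, ..., len(s).
# Dividing by a plain array works position by position, so no index is needed.
running_mean = s.cumsum() / np.arange(1, len(s) + 1)

# Cross-check against the built-in expanding mean (assumes no NaN in s).
assert np.allclose(running_mean, s.expanding().mean())

print(running_mean)
```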

Upvotes: 12

Andy Hayden

Reputation: 375735

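As @TomAugspurger points out, you can use expanding_mean (in older pandas; it was deprecated in 0.18.0 in favor of .expanding().mean()). The second argument, 4, is min_periods, so the first three rows are NaN, as in the question's table: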

In [11]: s = pd.Series([4, 5, 6, 7, 4, 5, 6, 7])

In [12]: pd.expanding_mean(s, 4)
Out[12]: 
0         NaN
1         NaN
2         NaN
3    5.500000
4    5.200000
5    5.166667
6    5.285714
7    5.500000
dtype: float64
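
On pandas 0.18.0 and later, the same result can likely be obtained by passing `min_periods` to `expanding()`. This is a sketch of that equivalent call, not part of the original answer; the variable name `result` is illustrative.

```python
import pandas as pd

s = pd.Series([4, 5, 6, 7, 4, 5, 6, 7])

# min_periods=4 leaves NaN until four observations are available,
# which reproduces the blank first three rows in the question's table.
result = s.expanding(min_periods=4).mean()

print(result)
```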

Upvotes: 17
