Reputation: 185
I have a dataframe with an observation number, and id, and a number
Obs# Id Value
--------------------
1 1 5.643
2 1 7.345
3 2 0.567
4 2 1.456
I want to calculate a new column that is the mean of the previous values of a specific id
I am trying to use something like this but it only acquires the previous value:
df.groupby('Id')['Value'].apply(lambda x: x.shift(1) ...
My question is how do I acquire the range of previous values filtered by the Id so I can calculate the mean ?
So the new column based on this example should be
5.643
6.494
0.567
1.0115
Upvotes: 2
Views: 705
Reputation: 7296
You can also do it like:
df = pd.DataFrame({'Obs':[1,2,3,4],'Id':[1,1,2,2],'Value':[5.643,7.345, 0.567,1.456]})
df.groupby('Id')['Value'].apply(lambda x: x.cumsum()/np.arange(1, len(x)+1))
It gives output as :
5.643
6.494
0.567
1.0115
Upvotes: 1
Reputation: 51185
It seems that you want expanding
, then mean
df.groupby('Id').Value.expanding().mean()
Id
1.0 1 5.6430
2 6.4940
2.0 3 0.5670
4 1.0115
Name: Value, dtype: float64
Upvotes: 8