Reputation: 463
import pandas as pd
df = pd.DataFrame({'A':[3,5,3,4,2,3,2,3,4,3,2,2,2,3],
'B':[10,20,30,40,20,30,40,10,20,30,15,60,20,15]})
A B
0 3 10
1 5 20
2 3 30
3 4 40
4 2 20
5 3 30
6 2 40
7 3 10
8 4 20
9 3 30
10 2 15
11 2 60
12 2 20
13 3 15
I'd like to append a C
column, containing rolling average of B
(rolling period = A
).
For example, the C
value at row index(2)
should be df.B.rolling(3).mean() = mean(10,20,30)
, and the C
value at row index(4)
should be df.B.rolling(2).mean() = mean(40,20)
.
Upvotes: 3
Views: 265
Reputation: 294248
probably stupid slow... but this get's it done
def crazy_apply(row):
p = df.index.get_loc(row.name)
a = row.A
return df.B.iloc[p-a+1:p+1].mean()
df.apply(crazy_apply, 1)
0 NaN
1 NaN
2 20.000000
3 25.000000
4 30.000000
5 30.000000
6 35.000000
7 26.666667
8 25.000000
9 20.000000
10 22.500000
11 37.500000
12 40.000000
13 31.666667
dtype: float64
explanation
apply
iterates through each column or each row. We iterate through each row because we used the parameter axis=1
(see 1
as the second argument in the call to apply
). So every iteration of apply
passes the a pandas series object that represents the current row. the current index value is in the name
attribute of the row. The index of the row object is the same as the columns of df
.
So, df.index.get_loc(row.name)
finds the ordinal position of the current index value held in row.name
. row.A
is the column A
for that row.
Upvotes: 2