Reputation: 1010
I have the following dataframe and would like to get the rolling cumulative return over the last lets say for this example 2 periods grouped by an identifier. For my actual case I need a longer period, but my problem is more with the groupby:
id return
2012 1 0.5
2012 2 0.2
2013 1 0.1
2013 2 0.3
The result should look like this:
id return cumreturn
2012 1 0.5 0.5
2012 2 0.2 0.2
2013 1 0.1 0.65
2013 2 0.3 0.56
It is import that the period is rolling. I have the following formula so far:
df["cumreturn"] = df.groupby("id")["return"].fillna(0).pd.rolling_apply(df,5,lambda x: np.prod(1+x)-1)
However, I get the following error: AttributeError: 'Series' object has no attribute 'pd'. I know how to get the rolling cumulative return. However, I just cant figure out how to combine it with groupby.
Upvotes: 1
Views: 1548
Reputation: 153460
Let's try this:
df_out = (df.set_index('id', append=True)
.assign(cumreturn=df.groupby('id')['return'].rolling(2,min_periods=1)
.apply(lambda x: np.prod(1+x)-1)
.swaplevel(0,1)).reset_index(1))
Output:
id return cumreturn
2012 1 0.5 0.50
2012 2 0.2 0.20
2013 1 0.1 0.65
2013 2 0.3 0.56
Upvotes: 3