Reputation: 11
I am having trouble with a calculation that is probably simple but is giving me a headache.
I have a pandas DataFrame and I want to generate a column in which each row is the product of a value in the same row and a value in the previous row, plus that previous-row value. So far I have this:
AAPL GLD AAPL_cumu_rtn GLD_cumu_rtn AAPL_Dly_rtn \
Date
2011-01-03 329.570000 138.000000 100.000000 100.000000 NaN
2011-01-04 331.290012 134.750000 100.521896 97.644928 0.005219
2011-01-05 334.000008 134.369995 101.344178 97.369562 0.008180
2011-01-06 333.729988 133.830002 101.262247 96.978262 -0.000808
2011-01-07 336.120003 133.580002 101.987439 96.797103 0.007162
GLD_Dly_rtn AAPL_invest GLD_invest
2011-01-03 NaN 650000.0 350000.0
2011-01-04 -0.023551 0.0 0.0
2011-01-05 -0.002820 0.0 0.0
2011-01-06 -0.004019 0.0 0.0
2011-01-07 -0.001868 0.0 0.0
I want to finish filling the AAPL_invest and GLD_invest columns: multiply AAPL_Dly_rtn in the current row by AAPL_invest in the previous row, add the result to that previous AAPL_invest value, and so on down the column (and the same for GLD):
AAPL GLD AAPL_cumu_rtn GLD_cumu_rtn AAPL_Dly_rtn \
Date
2011-01-03 329.570000 138.000000 100.000000 100.000000 NaN
2011-01-04 331.290012 134.750000 100.521896 97.644928 0.005219
2011-01-05 334.000008 134.369995 101.344178 97.369562 0.008180
2011-01-06 333.729988 133.830002 101.262247 96.978262 -0.000808
2011-01-07 336.120003 133.580002 101.987439 96.797103 0.007162
GLD_Dly_rtn AAPL_invest GLD_invest
2011-01-03 NaN 650000.0 350000.0
2011-01-04 -0.023551 653392.3 341757.1
2011-01-05 -0.002820 658737.1 340793.3
2011-01-06 -0.004019 0.0 0.0
2011-01-07 -0.001868 0.0 0.0
If anyone has any idea how I can do this without using a for loop, I would appreciate it!
Thank you very much!!!!
Upvotes: 0
Views: 91
Reputation: 173
I think this can be done with .cumprod(). For example, if we take (df['GLD_Dly_rtn'] + 1).fillna(1) and then call .cumprod() on it, we get the cumulative growth factor relative to the first day, which we can then simply multiply by the initial investment. So for gold this would be:
df['GLD_total'] = (df['GLD_Dly_rtn'] + 1).fillna(1).cumprod() * 350000
The fillna part here is only there to fill the initial NaN.
Which results in:
0 350000.000000
1 341757.150000
2 340793.394837
3 339423.746183
4 338789.702625
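Here is a minimal, self-contained sketch of the same idea applied to both columns at once. The frame is rebuilt by hand from the daily returns printed in the question, and writing the results into AAPL_invest and GLD_invest (rather than GLD_total) is my assumption about what the asker wants; the initial amounts 650000 and 350000 are taken from the first row of the question's table.

```python
import numpy as np
import pandas as pd

# Daily returns copied from the question; the first row is NaN by construction.
dates = pd.to_datetime(
    ["2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06", "2011-01-07"]
)
df = pd.DataFrame(
    {
        "AAPL_Dly_rtn": [np.nan, 0.005219, 0.008180, -0.000808, 0.007162],
        "GLD_Dly_rtn": [np.nan, -0.023551, -0.002820, -0.004019, -0.001868],
    },
    index=dates,
)

# Turn each return into a growth factor (1 + r), treat the first day as 1,
# and take the running product down each column. No explicit row loop.
growth = (df[["AAPL_Dly_rtn", "GLD_Dly_rtn"]] + 1).fillna(1).cumprod()

# Scale by the initial investment in each asset.
df["AAPL_invest"] = growth["AAPL_Dly_rtn"] * 650000.0
df["GLD_invest"] = growth["GLD_Dly_rtn"] * 350000.0

print(df[["AAPL_invest", "GLD_invest"]])
```

This works because the recurrence invest[t] = invest[t-1] + invest[t-1] * rtn[t] is the same as invest[t] = invest[t-1] * (1 + rtn[t]), so the whole column is just the starting amount times a cumulative product.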
There may be a neater or more general way to do this, but I can't think of it at the moment.
edit:
I've just realised that you already have the cumulative return column in the dataframe so you could simply use:
df['GLD_total'] = (df['GLD_cumu_rtn'] / 100) * 350000
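As a quick sanity check, the sketch below compares the two routes on the GLD prices from the question. It assumes GLD_cumu_rtn is the price relative to the first day times 100 and GLD_Dly_rtn is the plain day-over-day percentage change; both are consistent with the numbers shown in the question, but the question doesn't say how those columns were actually built.

```python
import numpy as np
import pandas as pd

# GLD closing prices copied from the question.
df = pd.DataFrame(
    {"GLD": [138.000000, 134.750000, 134.369995, 133.830002, 133.580002]}
)

# Assumed definitions of the two return columns (see the note above).
df["GLD_cumu_rtn"] = df["GLD"] / df["GLD"].iloc[0] * 100
df["GLD_Dly_rtn"] = df["GLD"].pct_change()

# Route 1: scale the cumulative return directly.
from_cumu = df["GLD_cumu_rtn"] / 100 * 350000

# Route 2: running product of daily growth factors.
from_daily = (df["GLD_Dly_rtn"] + 1).fillna(1).cumprod() * 350000

# The two agree up to floating-point error.
print(np.allclose(from_cumu, from_daily))
```

If you start from the rounded daily returns as printed in the question, the two results can differ by a few cents, which is only rounding in the displayed values.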
Upvotes: 1