jurreaserna

Reputation: 11

Operations between columns and rows simultaneously in Pandas

I am having trouble with a calculation that I know is probably simple but is giving me a headache.

I have a pandas dataframe and I want to generate a column whose value in each row is the product of one value in the same row and one value in the previous row (row - 1), plus that same value from the previous row. So far I have this:

                      AAPL         GLD  AAPL_cumu_rtn  GLD_cumu_rtn  AAPL_Dly_rtn  \
Date
2011-01-03  329.570000  138.000000     100.000000    100.000000           NaN
2011-01-04  331.290012  134.750000     100.521896     97.644928      0.005219
2011-01-05  334.000008  134.369995     101.344178     97.369562      0.008180
2011-01-06  333.729988  133.830002     101.262247     96.978262     -0.000808
2011-01-07  336.120003  133.580002     101.987439     96.797103      0.007162

            GLD_Dly_rtn  AAPL_invest  GLD_invest

2011-01-03          NaN     650000.0    350000.0
2011-01-04    -0.023551          0.0         0.0
2011-01-05    -0.002820          0.0         0.0
2011-01-06    -0.004019          0.0         0.0
2011-01-07    -0.001868          0.0         0.0

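I want to finish filling the AAPL_invest and GLD_invest columns. For AAPL, that means multiplying AAPL_Dly_rtn in the current row by AAPL_invest in the previous row, adding the previous row's AAPL_invest to the result, and so on down the column (and likewise for GLD):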

                      AAPL         GLD  AAPL_cumu_rtn  GLD_cumu_rtn  AAPL_Dly_rtn  \
Date
2011-01-03  329.570000  138.000000     100.000000    100.000000           NaN
2011-01-04  331.290012  134.750000     100.521896     97.644928      0.005219
2011-01-05  334.000008  134.369995     101.344178     97.369562      0.008180
2011-01-06  333.729988  133.830002     101.262247     96.978262     -0.000808
2011-01-07  336.120003  133.580002     101.987439     96.797103      0.007162

            GLD_Dly_rtn  AAPL_invest  GLD_invest

2011-01-03          NaN     650000.0    350000.0
2011-01-04    -0.023551     653392.3    341757.1
2011-01-05    -0.002820     658737.1    340793.3
2011-01-06    -0.004019          0.0         0.0
2011-01-07    -0.001868          0.0         0.0

If anyone has any idea how I can do this without using a for loop, I would appreciate it!

Thank you very much!!!!

Upvotes: 0

Views: 91

Answers (1)

jack-tee

Reputation: 173

I think this can be done with .cumprod.

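Each row's value is the previous row's value times (1 + daily return), so the value on any day is the initial investment times the running product of (1 + daily return). For example, if we take (df['GLD_Dly_rtn'] + 1).fillna(1) and then call .cumprod() on it, we get the cumulative growth factor, which we can then simply multiply by the initial investment. For gold this would be: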

df['GLD_total'] = (df['GLD_Dly_rtn'] + 1).fillna(1).cumprod() * 350000

The fillna part here is only there to fill the initial NaN.

Which results in:

0    350000.000000
1    341757.150000
2    340793.394837
3    339423.746183
4    338789.702625
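
As a minimal, self-contained sketch of the same idea applied to both columns: the frame below is rebuilt by hand from the (rounded) daily returns printed in the question, and the names rtn_cols and growth are only illustrative, not from the original dataframe.

```python
import numpy as np
import pandas as pd

# Daily returns copied from the question (rounded to six decimals there).
dates = pd.to_datetime(
    ["2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06", "2011-01-07"]
)
df = pd.DataFrame(
    {
        "AAPL_Dly_rtn": [np.nan, 0.005219, 0.008180, -0.000808, 0.007162],
        "GLD_Dly_rtn": [np.nan, -0.023551, -0.002820, -0.004019, -0.001868],
    },
    index=dates,
)

# Cumulative growth factor per column; the first-row NaN becomes a factor of 1.
rtn_cols = ["AAPL_Dly_rtn", "GLD_Dly_rtn"]
growth = (df[rtn_cols] + 1).fillna(1).cumprod()

# Scale by the initial investments given in the question.
df["AAPL_invest"] = growth["AAPL_Dly_rtn"] * 650000.0
df["GLD_invest"] = growth["GLD_Dly_rtn"] * 350000.0

print(df[["AAPL_invest", "GLD_invest"]])
```

No row-wise loop is needed, since cumprod carries the previous row's value forward implicitly.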

There may be a neater or more general way to do this, but I can't think of one at the moment.

edit:

I've just realised that you already have the cumulative return column in the dataframe so you could simply use: df['GLD_total'] = (df['GLD_cumu_rtn'] / 100) * 350000
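
A quick sketch of that shortcut, using the GLD_cumu_rtn values printed in the question (the Series here is typed in by hand and the name gld_total is only illustrative):

```python
import pandas as pd

# GLD_cumu_rtn values copied from the question; 100 means "no change".
gld_cumu_rtn = pd.Series(
    [100.000000, 97.644928, 97.369562, 96.978262, 96.797103],
    name="GLD_cumu_rtn",
)

# Convert the percentage-style index to a growth factor and scale it.
gld_total = (gld_cumu_rtn / 100) * 350000

print(gld_total)
```

With these printed values, the results differ from the .cumprod version by well under one currency unit, presumably because the displayed daily returns are rounded to six decimals.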

Upvotes: 1
