Don Coder

Reputation: 556

Making calculation with all previous values in pandas

I have a monthly pricing table. For each row, I would like to calculate statistics such as the average, std, and sum over all rows from the very beginning up to that row. But when I do the calculation, pandas computes it over the whole table. I suspect I am doing something wrong: I believe I am calculating over the whole table instead of from the beginning up to that row. Here is what I have done so far, along with a sample of my table:

db['TY_MAX'] = db['Annual'] * np.mean(db['TY'][:-1])
db['TY_MIN'] = db['Annual'] * np.sum(db['TY'][:-1])
db['TY_MEAN'] = (db['TY_MAX'] + db['TY_MIN']) / 2

      Date      Annual  Monthly TY        TA
0   2005-01-01  1.0924  1.0055  0.994965  0.994729
1   2005-02-01  1.0869  1.0002  0.993100  1.002400
2   2005-03-01  1.0794  1.0026  1.002223  1.004488
3   2005-04-01  1.0818  1.0071  1.004807  1.002085
4   2005-05-01  1.0870  1.0092  1.002300  0.991875
5   2005-06-01  1.0895  1.0010  0.989628  0.993307
6   2005-07-01  1.0782  0.9943  1.000835  1.014281
7   2005-08-01  1.0791  1.0085  1.000741  1.001686
8   2005-09-01  1.0799  1.0102  0.995648  1.007622
9   2005-10-01  1.0752  1.0179  1.000837  0.996169
10  2005-11-01  1.0761  1.0140  1.001022  0.990335
11  2005-12-01  1.0772  1.0042  1.001949  1.003286
12  2006-01-01  1.0793  1.0075  1.002038  0.994739
13  2006-02-01  1.0815  1.0022  1.000092  1.000499
14  2006-03-01  1.0816  1.0027  1.006195  1.010671
15  2006-04-01  1.0883  1.0134  1.009464  1.005329

Upvotes: 1

Views: 65

Answers (1)

jezrael

Reputation: 862611

It seems you need rolling, with a window as long as the DataFrame and min_periods=1:

db['TY_sum'] = db['TY'].rolling(len(db), min_periods=1).sum()
print(db)
          Date  Annual  Monthly        TY        TA     TY_sum
0   2005-01-01  1.0924   1.0055  0.994965  0.994729   0.994965
1   2005-02-01  1.0869   1.0002  0.993100  1.002400   1.988065
2   2005-03-01  1.0794   1.0026  1.002223  1.004488   2.990288
3   2005-04-01  1.0818   1.0071  1.004807  1.002085   3.995095
4   2005-05-01  1.0870   1.0092  1.002300  0.991875   4.997395
5   2005-06-01  1.0895   1.0010  0.989628  0.993307   5.987023
6   2005-07-01  1.0782   0.9943  1.000835  1.014281   6.987858
7   2005-08-01  1.0791   1.0085  1.000741  1.001686   7.988599
8   2005-09-01  1.0799   1.0102  0.995648  1.007622   8.984247
9   2005-10-01  1.0752   1.0179  1.000837  0.996169   9.985084
10  2005-11-01  1.0761   1.0140  1.001022  0.990335  10.986106
11  2005-12-01  1.0772   1.0042  1.001949  1.003286  11.988055
12  2006-01-01  1.0793   1.0075  1.002038  0.994739  12.990093
13  2006-02-01  1.0815   1.0022  1.000092  1.000499  13.990185
14  2006-03-01  1.0816   1.0027  1.006195  1.010671  14.996380
15  2006-04-01  1.0883   1.0134  1.009464  1.005329  16.005844
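
As a quick check on why this works: a rolling window as long as the whole series, with min_periods=1, never drops earlier rows, so it should match a plain cumulative sum. A minimal sketch using the first five TY values from the question (the variable names ty, rolling_sum and cum_sum are only for illustration):

```python
import numpy as np
import pandas as pd

# First five TY values from the question's table
ty = pd.Series([0.994965, 0.993100, 1.002223, 1.004807, 1.002300])

# The window spans the whole series, so each row sees every row before it;
# min_periods=1 lets the first rows produce a value instead of NaN
rolling_sum = ty.rolling(len(ty), min_periods=1).sum()

# Plain cumulative sum for comparison
cum_sum = ty.cumsum()

# Check that both approaches agree within floating-point tolerance
print(np.allclose(rolling_sum, cum_sum))
```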

Thanks to @Edward Khachatryan for the better answer with expanding:

db['TY_sum'] = db['TY'].expanding(min_periods=1).sum()
print(db)
          Date  Annual  Monthly        TY        TA     TY_sum
0   2005-01-01  1.0924   1.0055  0.994965  0.994729   0.994965
1   2005-02-01  1.0869   1.0002  0.993100  1.002400   1.988065
2   2005-03-01  1.0794   1.0026  1.002223  1.004488   2.990288
3   2005-04-01  1.0818   1.0071  1.004807  1.002085   3.995095
4   2005-05-01  1.0870   1.0092  1.002300  0.991875   4.997395
5   2005-06-01  1.0895   1.0010  0.989628  0.993307   5.987023
6   2005-07-01  1.0782   0.9943  1.000835  1.014281   6.987858
7   2005-08-01  1.0791   1.0085  1.000741  1.001686   7.988599
8   2005-09-01  1.0799   1.0102  0.995648  1.007622   8.984247
9   2005-10-01  1.0752   1.0179  1.000837  0.996169   9.985084
10  2005-11-01  1.0761   1.0140  1.001022  0.990335  10.986106
11  2005-12-01  1.0772   1.0042  1.001949  1.003286  11.988055
12  2006-01-01  1.0793   1.0075  1.002038  0.994739  12.990093
13  2006-02-01  1.0815   1.0022  1.000092  1.000499  13.990185
14  2006-03-01  1.0816   1.0027  1.006195  1.010671  14.996380
15  2006-04-01  1.0883   1.0134  1.009464  1.005329  16.005844
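
The same expanding window should also give the other running statistics mentioned in the question. A minimal sketch, assuming the intent was to replace the whole-column np.mean and np.sum with their running counterparts; the column names TY_run_mean, TY_run_std and TY_run_sum are made up for illustration, and only the first four rows of the sample are used:

```python
import pandas as pd

# First four rows of the question's sample (Annual and TY only)
db = pd.DataFrame({
    'Annual': [1.0924, 1.0869, 1.0794, 1.0818],
    'TY': [0.994965, 0.993100, 1.002223, 1.004807],
})

# One expanding window, reused for several statistics
exp = db['TY'].expanding(min_periods=1)
db['TY_run_mean'] = exp.mean()
db['TY_run_std'] = exp.std()   # first row is NaN: sample std needs two values
db['TY_run_sum'] = exp.sum()

# The question's columns, built from running values instead of scalars
# (assumption: this is what the asker intended)
db['TY_MAX'] = db['Annual'] * db['TY_run_mean']
db['TY_MIN'] = db['Annual'] * db['TY_run_sum']
db['TY_MEAN'] = (db['TY_MAX'] + db['TY_MIN']) / 2
print(db)
```

If the current row should be left out of each statistic (only strictly previous rows), calling .shift(1) on the expanding result is one way to do it.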

Upvotes: 1
