How can I compute the cumulative weighted average in new column?

Question

Read all related pages on google and stackoverflow, and I still can't find the solution..

Given this df fragment:

        key_br_acc_posid  lot_in         price
ix                                            
1      1_885020_76141036    0.03       1.30004
2      1_885020_76236801    0.02       1.15297
5      1_885020_76502318    0.50    2752.08000
8      1_885020_76502318    4.50    2753.93000
9      1_885020_76502318    0.50    2753.93000
                 ...     ...           ...
1042  1_896967_123068980    0.01       1.17657
1044  1_896967_110335293    0.01      28.07100
1047  1_896967_110335293    0.01      24.14000
1053  1_896967_146913299   25.00      38.55000
1054  1_896967_147039856    2.00  121450.00000

How can I create a new column w_avg_price computing the moving weighted average price by key_br_acc_posid? The lot_in is the weight and the price is the value.

I tried many approaches with groupby() + np.average() buy I have to avoid the data aggregation. I need this value in each row.

piterbarg · Accepted Answer

groupby and then perform the calculation for each group using cumsum()s:

(df.groupby('key_br_acc_posid', as_index = False)
    .apply(lambda g: g.assign(w_avg_price = (g['lot_in']*g['price']).cumsum()/g['lot_in'].cumsum()))    
    .reset_index(level = 0, drop = True)
)

result:


        key_br_acc_posid    lot_in         price    w_avg_price
----  ------------------  --------  ------------  -------------
   1   1_885020_76141036      0.03       1.30004        1.30004
   2   1_885020_76236801      0.02       1.15297        1.15297
   5   1_885020_76502318      0.5     2752.08        2752.08
   8   1_885020_76502318      4.5     2753.93        2753.74
   9   1_885020_76502318      0.5     2753.93        2753.76
1044  1_896967_110335293      0.01      28.071         28.071
1047  1_896967_110335293      0.01      24.14          26.1055
1042  1_896967_123068980      0.01       1.17657        1.17657
1053  1_896967_146913299     25         38.55          38.55
1054  1_896967_147039856      2     121450         121450

How can I compute the cumulative weighted average in new column?

Answers (2)

Related Questions