Pandas cumsum after sorting on different column

Question

I have a dataframe df as following:

Col1    Col2    SortVal Qty    Col3
R1      Grp1    0.88    160.8  1
R2      Grp1    2.28    20.9   3
R1      Grp1    2.09    184.6  5  
R1      Grp1    2.44    55.2   7
R2      Grp1    0.61    80.4   9 
R1      Grp1    2.41    41.9   1
R2      Grp1    1.66    92.3   4
R2      Grp1    2.3     27.6   6

I need to do the following:

group by Col1 and Col2
Sort ascending in each group for SortVal
cummulative sum on Qty

I am trying the following:

df.sort_values('SortVal', ascending=True).groupby(['Col1', 'Col2']).cumsum()

But I am not getting the following desired solution:

Col1    Col2    SortVal Qty     cumS    Col3
R1      Grp1    0.88    160.8   160.8   1
R1      Grp1    2.09    184.6   345.4   5
R1      Grp1    2.41    41.9    387.3   1
R1      Grp1    2.44    55.2    442.5   7
R2      Grp1    0.61    80.4    80.4    9
R2      Grp1    1.66    92.3    172.7   4
R2      Grp1    2.28    20.9    193.6   3
R2      Grp1    2.3     27.6    221.2   6

cs95 · Accepted Answer

Sort in advance, then group and compute the cumsum. After that, sort again:

(df.assign(cumS=df.sort_values(['Col1', 'Col2', 'SortVal')
                  .groupby(['Col1', 'Col2'], sort=False)['Qty']
                  .cumsum())
   .sort_values(['Col1', 'Col2', 'SortVal']))     

  Col1  Col2  SortVal    Qty  Col3   cumS
0   R1  Grp1     0.88  160.8     1  160.8
2   R1  Grp1     2.09  184.6     5  345.4
5   R1  Grp1     2.41   41.9     1  387.3
3   R1  Grp1     2.44   55.2     7  442.5
4   R2  Grp1     0.61   80.4     9   80.4
6   R2  Grp1     1.66   92.3     4  172.7
1   R2  Grp1     2.28   20.9     3  193.6
7   R2  Grp1     2.30   27.6     6  221.2

Another method, similar to the first but reduces one sort step in exchange for an apply:

(df.sort_values(['Col1', 'Col2', 'SortVal']) 
   .groupby(['Col1', 'Col2'], sort=False) 
   .apply(lambda x: x.assign(cumS=x['Qty'].cumsum())))

  Col1  Col2  SortVal    Qty  Col3   cumS
0   R1  Grp1     0.88  160.8     1  160.8
2   R1  Grp1     2.09  184.6     5  345.4
5   R1  Grp1     2.41   41.9     1  387.3
3   R1  Grp1     2.44   55.2     7  442.5
4   R2  Grp1     0.61   80.4     9   80.4
6   R2  Grp1     1.66   92.3     4  172.7
1   R2  Grp1     2.28   20.9     3  193.6
7   R2  Grp1     2.30   27.6     6  221.2

Pandas cumsum after sorting on different column

Answers (2)

Related Questions