Pandas Dataframe cumsum function to "restart" when value in row changes

Question

df['total'] = (df.DR - df.CR).cumsum()

is giving:

...
    Name     DR       CR     total 
303   B3  46.80     0.00  46682.07                                                                                                         
304   B3  45.20     0.00  46727.27                                                                                                                                                                                                                
395  BS1   0.00    10.37  47905.31                                                                                                         
396  BS2   0.00    87.00  47818.31                                                                                                         
397    C   0.00   482.10  47336.21                                                                                                                                                                                                               
399    C  20.00     0.00  47356.21

However I would like the cumsum to "restart" whenever the "Name" column (B3,BS1,C) changes to a different value.

So the desired result is:

    Name     DR       CR     total 
303   B3  46.80     0.00     46.80                                                                                                         
304   B3  45.20     0.00      1.60                                                                                                                                                                                                                
395  BS1   0.00    10.37    -10.37                                                                                                         
396  BS2   0.00    87.00    -97.37                                                                                                         
397    C   0.00   482.10   -482.10                                                                                                                                                                                                               
399    C  20.00     0.00   -462.10

I am new to pandas. Thanks for your help

I have tried things like, but not working: df['total'] = df.groupby('GL')[(df.DR - df.CR)].cumsum()

Valdi_Bo · Accepted Answer

First variant - if you want to "join" all rows for each Name into a single group:

df['total'] = df.groupby('Name').apply(lambda grp:
    (grp.DR - grp.CR).cumsum()).reset_index(level=0, drop=True)

For your source data the result is:

    Name    DR      CR   total
303   B3  46.8    0.00   46.80
304   B3  45.2    0.00   92.00
395  BS1   0.0   10.37  -10.37
396  BS1   0.0   87.00  -97.37
397    C   0.0  482.10 -482.10
399    C  20.0    0.00 -462.10

Second variant - if any change in Name is to start a new group.

Assume that your DataFrame contains:

    Name    DR      CR
303   B3  46.8    0.00
304   B3  45.2    0.00
395  BS1   0.0   10.37
396  BS1   0.0   87.00
397    C   0.0  482.10
399    C  20.0    0.00
400   B3  53.0    8.00
401   B3  40.8    6.15

and the second B3 group is to be summed separately from the first group:

df['total'] = df.groupby((df.Name != df.Name.shift()).cumsum())\
    .apply(lambda grp: (grp.DR - grp.CR).cumsum()).reset_index(level=0, drop=True)

getting:

    Name    DR      CR   total
303   B3  46.8    0.00   46.80
304   B3  45.2    0.00   92.00
395  BS1   0.0   10.37  -10.37
396  BS1   0.0   87.00  -97.37
397    C   0.0  482.10 -482.10
399    C  20.0    0.00 -462.10
400   B3  53.0    8.00   45.00
401   B3  40.8    6.15   79.65

As you can see, the second B3 group is summed separately.

Pandas Dataframe cumsum function to "restart" when value in row changes

Answers (1)

Related Questions

Pandas Dataframe cumsum function to &quot;restart&quot; when value in row changes

Answers (1)

Related Questions

Pandas Dataframe cumsum function to "restart" when value in row changes