Reputation: 269
df['total'] = (df.DR - df.CR).cumsum()
is giving:
...
Name DR CR total
303 B3 46.80 0.00 46682.07
304 B3 45.20 0.00 46727.27
395 BS1 0.00 10.37 47905.31
396 BS2 0.00 87.00 47818.31
397 C 0.00 482.10 47336.21
399 C 20.00 0.00 47356.21
However I would like the cumsum to "restart" whenever the "Name"
column (B3,BS1,C) changes to a different value.
So the desired result is:
Name DR CR total
303 B3 46.80 0.00 46.80
304 B3 45.20 0.00 1.60
395 BS1 0.00 10.37 -10.37
396 BS2 0.00 87.00 -97.37
397 C 0.00 482.10 -482.10
399 C 20.00 0.00 -462.10
I am new to pandas. Thanks for your help
I have tried things like, but not working:
df['total'] = df.groupby('GL')[(df.DR - df.CR)].cumsum()
Upvotes: 0
Views: 1024
Reputation: 30991
First variant - if you want to "join" all rows for each Name into a single group:
df['total'] = df.groupby('Name').apply(lambda grp:
(grp.DR - grp.CR).cumsum()).reset_index(level=0, drop=True)
For your source data the result is:
Name DR CR total
303 B3 46.8 0.00 46.80
304 B3 45.2 0.00 92.00
395 BS1 0.0 10.37 -10.37
396 BS1 0.0 87.00 -97.37
397 C 0.0 482.10 -482.10
399 C 20.0 0.00 -462.10
Second variant - if any change in Name is to start a new group.
Assume that your DataFrame contains:
Name DR CR
303 B3 46.8 0.00
304 B3 45.2 0.00
395 BS1 0.0 10.37
396 BS1 0.0 87.00
397 C 0.0 482.10
399 C 20.0 0.00
400 B3 53.0 8.00
401 B3 40.8 6.15
and the second B3 group is to be summed separately from the first group:
df['total'] = df.groupby((df.Name != df.Name.shift()).cumsum())\
.apply(lambda grp: (grp.DR - grp.CR).cumsum()).reset_index(level=0, drop=True)
getting:
Name DR CR total
303 B3 46.8 0.00 46.80
304 B3 45.2 0.00 92.00
395 BS1 0.0 10.37 -10.37
396 BS1 0.0 87.00 -97.37
397 C 0.0 482.10 -482.10
399 C 20.0 0.00 -462.10
400 B3 53.0 8.00 45.00
401 B3 40.8 6.15 79.65
As you can see, the second B3 group is summed separately.
Upvotes: 1