Reputation:
I have a pandas dataframe with LTD and CycleCount columns and need to fill CycleSum column like this:
LTD CycleCount CycleSum
1/1/2017 1 1
1/1/2017 1 1
2/1/2017 0 1
2/1/2017 0 1
3/1/2017 1 2
3/1/2017 1 2
4/1/2017 1 3
4/1/2017 1 3
I tried to do:
df['CycleSum']=df.groupby('LTD')['CycleCount'].prod()
to get unique LTD with 1/0 values in CycleCount and this is OK as the first step for me, because I got a group:
LTD CycleCount
1/1/2017 1
2/1/2017 0
3/1/2017 1
4/1/2017 1
But I can't find a way how to make a cumsum on this group and then transform it back to original dataframe.
CycleCount can be only 1 or 0 and the same for the each LTD. But CycleSum must increase for any different LTD with CycleCount=1 It is easy to do in Excel but difficult to realize in Python for me.
Any suggestions? The purpose to make all in one action
PS I'm former VBA developer and maybe this is a noob case, but it is extremelly difficult to go away from Excel logic.
Upvotes: 2
Views: 159
Reputation: 294258
do the groupby
+ prod
followed by cumsum
then join
df.join(df.groupby('LTD').CycleCount.prod().cumsum().rename('CycleSum'), on='LTD')
LTD CycleCount CycleSum
0 2017-01-01 1 1
1 2017-01-01 1 1
2 2017-02-01 0 1
3 2017-02-01 0 1
4 2017-03-01 1 2
5 2017-03-01 1 2
6 2017-04-01 1 3
7 2017-04-01 1 3
These would also produce the same results
df.join(df.groupby('LTD').CycleCount.all().cumsum().rename('CycleSum'), on='LTD')
df.join(df.groupby('LTD').CycleCount.max().cumsum().rename('CycleSum'), on='LTD')
LTD CycleCount CycleSum
0 2017-01-01 1 1
1 2017-01-01 1 1
2 2017-02-01 0 1
3 2017-02-01 0 1
4 2017-03-01 1 2
5 2017-03-01 1 2
6 2017-04-01 1 3
7 2017-04-01 1 3
Upvotes: 1