lb_so
lb_so

Reputation: 146

Python: Cumulative Sum with changing key

I have a table of data such as:

F(1)  F(2)  F(3)  Amount
A     B     C     100
A     B     C     100
A     B     C     100
D     E     F     300
D     E     F     150
G     H     I     100
G     H     I     200

I would like to produce a new column, showing the cumulative sum of field 'Amount', but one that resets to 0 whenever the key of columns F(1), F(2) and F(3) change.

i.e. I would like to create the following output (sans dotted lines!)

F(1)  F(2)  F(3)  Amount  CumSum
A     B     C     100     100
A     B     C     100     200
A     B     C     100     300
------------------------------ resets to zero as key changes
D     E     F     300     300
D     E     F     150     450
------------------------------ resets to zero as key changes
G     H     I     100     100
G     H     I     200     300

I have potentially up to a million rows in this table so I am looking for a robust implementation. Is pandas the way forward here? I have not used pandas before but am happy to explore.

Upvotes: 0

Views: 259

Answers (1)

user2285236
user2285236

Reputation:

group by your key columns and call cumsum:

df['CumSum'] = df.groupby(['F(1)', 'F(2)', 'F(3)'])['Amount'].cumsum()

df
Out: 
  F(1) F(2) F(3)  Amount  CumSum
0    A    B    C     100     100
1    A    B    C     100     200
2    A    B    C     100     300
3    D    E    F     300     300
4    D    E    F     150     450
5    G    H    I     100     100
6    G    H    I     200     300

Upvotes: 1

Related Questions