Joe_ft
Joe_ft

Reputation: 341

Cumulate data based on two column values in df python

I am trying to cumulate the data usage based on the ID and the month so that the cumulative data will be added as an additional column.

This is my code for a sample of the df:

df = pd.DataFrame({'ID':["SAD1", "SAD2", "SAD1", "SAD2", "SAD1", "SAD2", "SAD3"], 
               'Month':["201701","201701","201702","201702","201703","201703", "201703"], 
               'Usage':[20,40,100,50, 30, 30, 2000]})

My dataframe looks like this:

     ID   Month  Usage
0  SAD1  201701     20
1  SAD2  201701     40
2  SAD1  201702    100
3  SAD2  201702     50
4  SAD1  201703     30
5  SAD2  201703     30
6  SAD3  201703   2000

The result I want to end up with is:

     ID   Month  Usage  Cum Usage
0  SAD1  201701     20         20
1  SAD2  201701     40         40
2  SAD1  201702    100        120
3  SAD2  201702     50         90
4  SAD1  201703     30        150
5  SAD2  201703     30        120 
6  SAD3  201703   2000       2000

Can anyone help me out how to apply this to my df?

Upvotes: 0

Views: 399

Answers (1)

gzc
gzc

Reputation: 8639

groupby and cumsum

df['Cum Usage'] = df.groupby('ID').cumsum()

Upvotes: 4

Related Questions