Amie
Amie

Reputation: 103

Cumulative Sum in Pandas

I have a dataframe which looks like this:

project_code   start_date   end_date   date       spend
     489        5/15/18     5/15/19     3/1/19     100
     489        5/15/18     5/15/19     4/1/19     250
     489        5/15/18     5/15/19     5/1/19     50
     511        4/1/19      4/1/20      2/1/20     90
     511        4/1/19      4/1/20      3/1/20     50       
     489        5/15/19     5/15/20     3/1/20     100

I need to create another column in the same table which calculates the cumulative spend for that subscription period (defined by start and end date). So it should add all previous spends under a project code so long as they have the same start/end date.

project_code   start_date   end_date   date       spend    cumulative_subscription_spend
     489        5/15/18     5/15/19     3/1/19     100           100
     489        5/15/18     5/15/19     4/1/19     250           350
     489        5/15/18     5/15/19     5/1/19     50            400
     511        4/1/19      4/1/20      2/1/20     90            90
     511        4/1/19      4/1/20      3/1/20     50            140
     489        6/1/19      6/1/20      3/1/20     100           100

Most versions I have seen of this problem use groupby/aggregate but I am having trouble figuring out how that would work as a new column in the same table.

Upvotes: 1

Views: 194

Answers (1)

BENY
BENY

Reputation: 323226

Check with groupby + cumsum

df['cumulative_subscription_spend'] = df.groupby('project_code')['spend'].cumsum()

Upvotes: 1

Related Questions