Reputation: 103
I have a dataframe which looks like this:
project_code start_date end_date date spend
489 5/15/18 5/15/19 3/1/19 100
489 5/15/18 5/15/19 4/1/19 250
489 5/15/18 5/15/19 5/1/19 50
511 4/1/19 4/1/20 2/1/20 90
511 4/1/19 4/1/20 3/1/20 50
489 5/15/19 5/15/20 3/1/20 100
I need to create another column in the same table which calculates the cumulative spend for that subscription period (defined by start and end date). So it should add all previous spends under a project code so long as they have the same start/end date.
project_code start_date end_date date spend cumulative_subscription_spend
489 5/15/18 5/15/19 3/1/19 100 100
489 5/15/18 5/15/19 4/1/19 250 350
489 5/15/18 5/15/19 5/1/19 50 400
511 4/1/19 4/1/20 2/1/20 90 90
511 4/1/19 4/1/20 3/1/20 50 140
489 6/1/19 6/1/20 3/1/20 100 100
Most versions I have seen of this problem use groupby/aggregate but I am having trouble figuring out how that would work as a new column in the same table.
Upvotes: 1
Views: 194
Reputation: 323226
Check with groupby
+ cumsum
df['cumulative_subscription_spend'] = df.groupby('project_code')['spend'].cumsum()
Upvotes: 1