Reputation: 857
A similar question has been asked for cumsum
and grouping
but it didn't solve my case.
I have a financial balance sheet of a lot of years and need to sum all previous values by year.
This is my reproducible set:
df=pd.DataFrame(
{"Amount": [265.95,2250.00,-260.00,-2255.95,120],
"Year": [2018,2018,2018,2019,2019]})
The result I want is the following:
Year Amount
2017 0
2018 2255.95
2019 120.00
2020 120.00
So actually in a loop going from the lowest year in my whole set to the highest year in my set.
...
df[df.Year<=2017].Amount.sum()
df[df.Year<=2018].Amount.sum()
df[df.Year<=2019].Amount.sum()
df[df.Year<=2020].Amount.sum()
...
Upvotes: 0
Views: 141
Reputation: 862661
First step is aggregate sum
, then use Series.cumsum
and Series.reindex
with forward filling missing values by all possible years, last replace first missing values to 0
:
years = range(2017, 2021)
df1 = (df.groupby('Year')['Amount']
.sum()
.cumsum()
.reindex(years, method='ffill')
.fillna(0)
.reset_index())
print (df1)
Year Amount
0 2017 0.00
1 2018 2255.95
2 2019 120.00
3 2020 120.00
Upvotes: 1