Reputation: 83
I have this DataFrame:
Value Month
0 1
1 2
8 3
11 4
12 5
17 6
0 7
0 8
0 9
0 10
1 11
2 12
7 1
3 2
1 3
0 4
0 5
And I want to create a new variable "Cumsum" like this:
Value Month Cumsum
0 1 0
1 2 1
8 3 9
11 4 20
12 5 32
17 6
0 7
0 8 ...
0 9
0 10
1 11
2 12
7 1 7
3 2 10
1 3 11
0 4 11
0 5 11
Sorry if my post is not clean, I failed to include my DataFrame properly ...
My problem is that I do not have only 12 rows (1 row per month); I have many more. However, I know that my table is in order, and I want the cumulative sum to run up to the 12th month and start over whenever month 1 appears.
Thank you for your help.
Upvotes: 2
Views: 161
Reputation: 4233
Try:
df['Cumsum'] = df.groupby((df.Month == 1).cumsum())['Value'].cumsum()
print(df)
Value Month Cumsum
0 0 1 0
1 1 2 1
2 8 3 9
3 11 4 20
4 12 5 32
5 17 6 49
6 0 7 49
7 0 8 49
8 0 9 49
9 0 10 49
10 1 11 50
11 2 12 52
12 7 1 7
13 3 2 10
14 1 3 11
15 0 4 11
16 0 5 11
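To see why this works, it may help to look at the grouping key on its own. The sketch below rebuilds the question's data and stores the key in a variable named `block_id` (a name introduced here for illustration, not part of the answer). It assumes one row per month, so that `Month == 1` appears exactly once at the start of each block:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Value": [0, 1, 8, 11, 12, 17, 0, 0, 0, 0, 1, 2, 7, 3, 1, 0, 0],
    "Month": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5],
})

# (Month == 1) is True on the first row of each block; the running sum of
# these booleans is an id that increases by one at every Month == 1.
block_id = (df["Month"] == 1).cumsum()

# Cumulative sum of Value, restarted inside each block.
df["Cumsum"] = df.groupby(block_id)["Value"].cumsum()

# Show the key next to the result without adding it to df permanently.
print(df.assign(block_id=block_id))
```

If several rows could share the same month, this key would start a new block on each of those `Month == 1` rows, so a different key would be needed in that case.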
Upvotes: 2
Reputation: 5334
code:
import pandas as pd

df = pd.DataFrame({'value': [0, 1, 8, 11, 12, 17, 0, 0, 0, 0, 1, 2, 7, 3, 1, 0, 0],
                   'month': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5]})

# Number of complete 12-row blocks (assumes every block starts at month 1
# and has exactly one row per month).
temp = len(df) // 12
for i in range(temp + 1):
    start = i * 12
    if i < temp:
        # Complete block: .loc slicing is inclusive, so end is the last row.
        end = (i + 1) * 12 - 1
        df.loc[start:end, 'cumsum'] = df.loc[start:end, 'value'].cumsum()
    else:
        # Trailing partial block, if any.
        df.loc[start:, 'cumsum'] = df.loc[start:, 'value'].cumsum()
print(df)
output:
value month cumsum
0 0 1 0.0
1 1 2 1.0
2 8 3 9.0
3 11 4 20.0
4 12 5 32.0
5 17 6 49.0
6 0 7 49.0
7 0 8 49.0
8 0 9 49.0
9 0 10 49.0
10 1 11 50.0
11 2 12 52.0
12 7 1 7.0
13 3 2 10.0
14 1 3 11.0
15 0 4 11.0
16 0 5 11.0
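The loop above cuts the frame into fixed blocks of 12 rows. Under that same assumption (rows in order, every block starting at month 1 with one row per month), the same result can probably be had without an explicit loop by grouping on the row position integer-divided by 12. This is only a sketch, not the answerer's code, and `block` is a name introduced here:

```python
import numpy as np
import pandas as pd

# Sample data copied from the answer above.
df = pd.DataFrame({
    "value": [0, 1, 8, 11, 12, 17, 0, 0, 0, 0, 1, 2, 7, 3, 1, 0, 0],
    "month": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5],
})

# Rows 0-11 get block 0, rows 12-23 get block 1, and so on.
block = np.arange(len(df)) // 12

# Cumulative sum of value, restarted inside each 12-row block.
df["cumsum"] = df.groupby(block)["value"].cumsum()
print(df)
```

Because the whole column is assigned at once, it should stay an integer column, whereas the loop produces floats since the column is filled piece by piece.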
Upvotes: 1