Cummulative sum between consecutive groups in a dataframe

Question

I'm having a dataframe in this form

  col_1  group_ids value
0 Name_A     1       1
1 Name_B     2       4
2 Name_C     4       2
3 Name_D     5       10 
4 Name_E     6       14

The goal is to sum all the values that the group_ids are consecutive, which is to produce the following table

  col_1  group_ids value consecutive_cumsum
0 Name_A     1       1          1
1 Name_B     2       4          5
2 Name_C     4       2          2
3 Name_D     5       10         12
4 Name_E     6       14         26

I got it to work with using itertuples(). But what could be a more efficient way to deal with it? Many thanks!

BENY · Accepted Answer

We can try diff with cumsum then groupby this new key do cumsum again

df.groupby(df.group_ids.diff().ne(1).cumsum()).value.cumsum()

0     1
1     5
2     2
3    12
4    26
Name: value, dtype: int64

Cummulative sum between consecutive groups in a dataframe

Answers (2)

Related Questions