Sum column values in a dataframe if values in another column are next to each other

Question

Hello I have a dataframe:

import pandas as pd

df1 = {'name': ["x","x","x","x","x","x","x","y","y","y","y","y","y","y"],
       'a': [3,4,5,11,14,15,16,2,3,4,10,13,14,15],
       'b': [9,8,7,12,23,22,21,8,7,6,11,22,21,20],
       'val': [2,1,3,4,5,6,3,21,11,31,41,51,61,31]    
        }

df1 = pd.DataFrame (df1, columns = ['name','a','b','val'])

I wish to sum the numbers in the 'val' column if the numbers in the 'a' column are next to one another. E.g. in 'a' you have 3,4,5 (all next to each other) so add together their associated numbers in the 'val' column (i.e. 2+1+3) and then create a new column where the added value is present. The harder bit for me is grouping these by 'name'.

I don't know how well I've explained this, but here is the dataframe i wish to end up with

df2 = {'name': ["x","x","x","x","x","x","x","y","y","y","y","y","y","y"],
       'a': [3,4,5,11,14,15,16,2,3,4,10,13,14,15],
       'b': [9,8,7,12,23,22,21,8,7,6,11,22,21,20],
       'val': [2,1,3,4,5,6,3,21,11,31,41,51,61,31],
       'sum_val': [6,6,6,4,14,14,14,63,63,63,41,143,143,143]
        }

df2 = pd.DataFrame (df2, columns = ['name','a','b','val','sum_val'])

jezrael · Accepted Answer

Create groups by compare difference for not equal with cumulative sum per groups in lambda function and pass Series to GroupBy.transform with sum:

g = df1.groupby('name')['a'].apply(lambda x: x.diff().ne(1).cumsum())

df1['sum_val'] = df1.groupby([g, 'name'])['val'].transform('sum')
print (df1)

   name   a   b  val  sum_val
0     x   3   9    2        6
1     x   4   8    1        6
2     x   5   7    3        6
3     x  11  12    4        4
4     x  14  23    5       14
5     x  15  22    6       14
6     x  16  21    3       14
7     y   2   8   21       63
8     y   3   7   11       63
9     y   4   6   31       63
10    y  10  11   41       41
11    y  13  22   51      143
12    y  14  21   61      143
13    y  15  20   31      143

Sum column values in a dataframe if values in another column are next to each other

Answers (1)

Related Questions