user12367917
user12367917

Reputation:

Sum column values in a dataframe if values in another column are next to each other

Hello I have a dataframe:

import pandas as pd

df1 = {'name': ["x","x","x","x","x","x","x","y","y","y","y","y","y","y"],
       'a': [3,4,5,11,14,15,16,2,3,4,10,13,14,15],
       'b': [9,8,7,12,23,22,21,8,7,6,11,22,21,20],
       'val': [2,1,3,4,5,6,3,21,11,31,41,51,61,31]    
        }

df1 = pd.DataFrame (df1, columns = ['name','a','b','val'])

I wish to sum the numbers in the 'val' column if the numbers in the 'a' column are next to one another. E.g. in 'a' you have 3,4,5 (all next to each other) so add together their associated numbers in the 'val' column (i.e. 2+1+3) and then create a new column where the added value is present. The harder bit for me is grouping these by 'name'.

I don't know how well I've explained this, but here is the dataframe i wish to end up with

df2 = {'name': ["x","x","x","x","x","x","x","y","y","y","y","y","y","y"],
       'a': [3,4,5,11,14,15,16,2,3,4,10,13,14,15],
       'b': [9,8,7,12,23,22,21,8,7,6,11,22,21,20],
       'val': [2,1,3,4,5,6,3,21,11,31,41,51,61,31],
       'sum_val': [6,6,6,4,14,14,14,63,63,63,41,143,143,143]
        }

df2 = pd.DataFrame (df2, columns = ['name','a','b','val','sum_val'])

Upvotes: 1

Views: 173

Answers (1)

jezrael
jezrael

Reputation: 863301

Create groups by compare difference for not equal with cumulative sum per groups in lambda function and pass Series to GroupBy.transform with sum:

g = df1.groupby('name')['a'].apply(lambda x: x.diff().ne(1).cumsum())

df1['sum_val'] = df1.groupby([g, 'name'])['val'].transform('sum')
print (df1)

   name   a   b  val  sum_val
0     x   3   9    2        6
1     x   4   8    1        6
2     x   5   7    3        6
3     x  11  12    4        4
4     x  14  23    5       14
5     x  15  22    6       14
6     x  16  21    3       14
7     y   2   8   21       63
8     y   3   7   11       63
9     y   4   6   31       63
10    y  10  11   41       41
11    y  13  22   51      143
12    y  14  21   61      143
13    y  15  20   31      143

Upvotes: 1

Related Questions