Merging rows while updating another column in pandas

Question

node1   node2   weight

2          6     1

2          7     1

2          7     1

2          8     1

2         15     1

2         15     1

2         15     1

2         15     1

As can be seen from above I want to merge the rows where node1==node2 and update the weight wherever this condition is met so that there is only one row with unique node 1 and node 2 and the weight is the no of occurence of the equality condition.

Sample output is :

node 1       node 2        weight 

  2           7               2

  2           15              4

and so on.

Bharath M Shetty · Accepted Answer

If you have dataframe like

   node1  node2  weight
0      2      6       1
1      2      7       1
2      2      7       1
3      2      8       1
4      2     15       1
5      2     15       1
6      2     15       1
7      2     15       1

Option 1 : groupby sum

df.groupby(['node1','node2']).sum().reset_index()

  node1  node2  weight
0      2      6       1
1      2      7       2
2      2      8       1
3      2     15       4

Option 2 Pivot table with agg func as sum

df.pivot_table(index=['node1','node2'],aggfunc=sum).reset_index()

    node1  node2  weight
0      2      6       1
1      2      7       2
2      2      8       1
3      2     15       4

If you want group occurance more the once then use filter first then groupby sum i.e

ndf = df.groupby(['node1','node2']).filter(lambda x : len(x) > 1)
ndf = ndf.groupby(['node1','node2']).sum().reset_index()

   node1  node2  weight
0      2      7       2
1      2     15       4

Or

ndf = df.groupby(['node1','node2']).sum().reset_index()
ndf[ndf['weight'].ne(1)]

Merging rows while updating another column in pandas

Answers (1)

Related Questions