Reputation: 877
Similar to below
Buyer Seller Amount
John Mary 3
Mary John 2
David Bosco 2
Where I want to sum John and Mary rows into one
Expected out come
Trade1 Trade2 Amount
John Mary 5
David Bosco 2
My dataframe has around 6000 rows. Thank you for your help
Upvotes: 1
Views: 254
Reputation: 863301
First sort values by numpy.sort
and create boolean mask by DataFrame.duplicated
and then aggregate sum
:
df[['Buyer','Seller']] = pd.DataFrame(np.sort(df[['Buyer','Seller']], axis=1))
df2 = df.groupby(['Buyer','Seller'], as_index=False)['Amount'].sum()
df2.columns = ['Trade1','Trade2','Amount']
print (df2)
Trade1 Trade2 Amount
0 Bosco David 2
1 John Mary 5
If dont want modify original columns use syntactic sugar - groupby
with Series
:
df1 = pd.DataFrame(np.sort(df[['Buyer','Seller']], axis=1))
df1.columns = ['Trade1','Trade2']
df2 = df['Amount'].groupby([df1['Trade1'],df1['Trade2']]).sum().reset_index()
print (df2)
Trade1 Trade2 Amount
0 Bosco David 2
1 John Mary 5
Upvotes: 1