Reputation: 53
My first dataframe is:
df1 =
A B
61880 7
62646 8
62651 9
62656 10
62783 11
My second dataframe is:
df2 =
C D
62783 2
62646 3
61880 4
62656 5
62651 6
As you can see both in first and second data frame, we have columns with same values bun not in the same order(!) (col A and col C)
Desired output is: Take all values of "B" and "D" from rows which have the same value of "A" and "C" and do a math operation on them (for example, B divided by D).
Example: 2 + 11 (both of them have same value of 62783 in columns "A"and "C")
ADDITION! EDITED!
Thanks a lot! I've faced with another issue I forgot to mention:
Sometimes, in column "A" I have same values, like for instance we can see the "61880" twice and so on:
df1 =
A B
*61880* 7
**62646** 8
62651 9
62656 10
62783 11
*61880* 3
**62646** 2
I want to do same process you have mentioned BUT by taking into account the following:
I want to sort column "A" based on values of "B", namely sum of "B". Something like:
61880 7+3
62646 8+2
...
I did it with data.groupby('mm_fid')['vel'].sum()
I got result, but I can't do operation after that. Thus, I want do create unique column with sum of "B" and then proceed with answers you provided!
Upvotes: 1
Views: 821
Reputation: 863166
I believe you need DataFrame.add
with DataFrame.set_index
:
df3 = df1.set_index('A')['B'].add(df2.set_index('C')['D'], fill_value=0).reset_index()
df3.columns = ['A','B']
print (df3)
A B
0 61880 11
1 62646 11
2 62651 15
3 62656 15
4 62783 13
Another solution is use concat
with aggregate sum
:
d = {'C':'A', 'D':'B'}
df3 = pd.concat([df1, df2.rename(columns=d)]).groupby('A', as_index=False)['B'].sum()
print (df3)
A B
0 61880 11
1 62646 11
2 62651 15
3 62656 15
4 62783 13
Upvotes: 1