Nazar Tarlanli
Nazar Tarlanli

Reputation: 53

Math operations of column based on same values of other column pandas

My first dataframe is:

df1 =     


    A        B
    61880    7
    62646    8
    62651    9
    62656    10
    62783    11

My second dataframe is:

df2 =

    C        D
    62783    2
    62646    3
    61880    4
    62656    5
    62651    6

As you can see both in first and second data frame, we have columns with same values bun not in the same order(!) (col A and col C)

Desired output is: Take all values of "B" and "D" from rows which have the same value of "A" and "C" and do a math operation on them (for example, B divided by D).

Example: 2 + 11 (both of them have same value of 62783 in columns "A"and "C")

ADDITION! EDITED!

Thanks a lot! I've faced with another issue I forgot to mention:

Sometimes, in column "A" I have same values, like for instance we can see the "61880" twice and so on:

df1 =

A        B
*61880*    7
**62646**    8
62651    9
62656    10
62783    11
*61880*    3
**62646**    2

I want to do same process you have mentioned BUT by taking into account the following:

I want to sort column "A" based on values of "B", namely sum of "B". Something like:

 61880    7+3
 62646    8+2
 ...

I did it with data.groupby('mm_fid')['vel'].sum() I got result, but I can't do operation after that. Thus, I want do create unique column with sum of "B" and then proceed with answers you provided!

Upvotes: 1

Views: 821

Answers (2)

jezrael
jezrael

Reputation: 863166

I believe you need DataFrame.add with DataFrame.set_index:

df3 = df1.set_index('A')['B'].add(df2.set_index('C')['D'], fill_value=0).reset_index()
df3.columns = ['A','B']
print (df3)
       A   B
0  61880  11
1  62646  11
2  62651  15
3  62656  15
4  62783  13

Another solution is use concat with aggregate sum:

d = {'C':'A', 'D':'B'}
df3 = pd.concat([df1, df2.rename(columns=d)]).groupby('A', as_index=False)['B'].sum()
print (df3)
       A   B
0  61880  11
1  62646  11
2  62651  15
3  62656  15
4  62783  13

Upvotes: 1

yatu
yatu

Reputation: 88276

You need a merge, then simply add the corresponding values:

res = df1.merge(df2, left_on='A', right_on='C')
(res.B + res.D).to_frame('result').set_index(res.A)

        result
A            
61880      15
62646      14
62651      21
62656      20
62783      15

Upvotes: 3

Related Questions