Shivam Raj
Shivam Raj

Reputation: 64

add values of two columns from 2 different dataframes pandas

i want to add 2 columns of 2 different dataframes based on the condition that name is same:

import pandas as pd
df1 = pd.DataFrame([("Apple",2),("Litchi",4),("Orange",6)], columns=['a','b'])
df2 = pd.DataFrame([("Apple",200),("Orange",400),("Litchi",600)], columns=['a','c'])

now i want to add column b and c if the name is same in a.

I tried this df1['b+c']=df1['b']+df2['c'] but it simply adds column b and c so the result comes as

      a    b  b+c
0   Apple  2  202
1  Litchi  4  404
2  Orange  6  606

but i want to

    a  b+c
0   Apple  202
1  Litchi  604
2  Orange  406

i guess i have to use isin but i am not getting how?

Upvotes: 2

Views: 6734

Answers (2)

G.G
G.G

Reputation: 765

duckdb:

df1.sql.set_alias("tb1").join(df2.sql.set_alias("tb2"),condition="tb1.a=tb2.a",how="inner").select("tb1.a,b+c")

┌─────────┬─────────┐
│    a    │ (b + c) │
│ varchar │  int64  │
├─────────┼─────────┤
│ Apple   │     202 │
│ Litchi  │     604 │
│ Orange  │     406 │
└─────────┴─────────┘

Upvotes: 0

jezrael
jezrael

Reputation: 863531

Columns b and c are aligned by index values in sum operation, so is necessary create index by DataFrame.set_index by column a:

s1 = df1.set_index('a')['b']
s2 = df2.set_index('a')['c']
df1 = (s1+s2).reset_index(name='b+c')
print (df1)
        a  b+c
0   Apple  202
1  Litchi  604
2  Orange  406

EDIT: If need original value for not matched values use Series.add with parameter fill_value=0

df2 = pd.DataFrame([("Apple",200),("Apple",400),("Litchi",600)], columns=['a','c']) 
print (df2)
        a    c
0   Apple  200
1   Apple  400
2  Litchi  600

s1 = df1.set_index('a')['b']
s2 = df2.set_index('a')['c']
df1 = s1.add(s2, fill_value=0).reset_index(name='b+c')
print (df1)
        a    b+c
0   Apple  202.0
1   Apple  402.0
2  Litchi  604.0
3  Orange    6.0

Upvotes: 3

Related Questions