ChinwalPrasad
ChinwalPrasad

Reputation: 327

How can I join 2 pandas dataframe using left join?

I have 2 dataframes which I need to join using left join. In sql I have the query as

SELECT A.* INTO NewTable FROM A LEFT JOIN B ON A.id=B.id WHERE B.id IS NULL;

I have the 2 dataframes as: df1:

id name
1 one
2 two
3 three
4 four

df2:

id
2
3

What I am expecting is:

id name
1 one
4 four

What I have tried?

common = df1.merge(df2, on=['id', 'id'])
result = df1[~df1.id.isin(common.id)]

I get more results in this then what the query returns. Any help is appreciated.

Upvotes: 0

Views: 1137

Answers (4)

SeaBean
SeaBean

Reputation: 23217

You can use left join with .merge() with indicator= parameter turned on. Then, filter the indicator values equal to "left_only" with .query(), as follows:

df1.merge(df2, on='id', how='left', indicator='ind').query('ind == "left_only"')

Result:

   id  name        ind
0   1   one  left_only
3   4  four  left_only

Optionally, you can remove the indicator column, as follows:

df1.merge(df2, on='id', how='left', indicator='ind').query('ind == "left_only"').drop('ind', axis=1)

Result:

   id  name
0   1   one
3   4  four

Upvotes: 1

nbk
nbk

Reputation: 49375

you have the right solution,only you do interpret the results wrong.

This will give you the result without index

import pandas as pd
d = {'id': [1, 2,3,4], 'col2': ['one','two','three','four']}
d1 = {'id': [2,3]}

df1 = pd.DataFrame(data=d)

df2 =  pd.DataFrame(data=d1)


result = df1[~df1.id.isin(df2.id)]
print(result.to_string(index=False))

enter image description here

Upvotes: 1

Piotr Żak
Piotr Żak

Reputation: 2132

import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.array([[1, "one"], [2, "two"], [3, "three"], [4, "four"]]),
                   columns=['id', 'name '])

df2 = pd.DataFrame(np.array([[1], [2]]),
                   columns=['id'])

df1.drop(df2['id'], axis=0,inplace=True)
df1

enter image description here

Upvotes: 0

Andrej Kesely
Andrej Kesely

Reputation: 195438

Try:

print(df1[~df1["id"].isin(df2["id"])])

Prints:

   id  name
0   1   one
3   4  four

Upvotes: 0

Related Questions