Reputation: 327
I have 2 dataframes which I need to join using left join. In sql I have the query as
SELECT A.* INTO NewTable FROM A LEFT JOIN B ON A.id=B.id WHERE B.id IS NULL;
I have the 2 dataframes as: df1:
id | name |
---|---|
1 | one |
2 | two |
3 | three |
4 | four |
df2:
id |
---|
2 |
3 |
What I am expecting is:
id | name |
---|---|
1 | one |
4 | four |
What I have tried?
common = df1.merge(df2, on=['id', 'id'])
result = df1[~df1.id.isin(common.id)]
I get more results in this then what the query returns. Any help is appreciated.
Upvotes: 0
Views: 1137
Reputation: 23217
You can use left join with .merge()
with indicator=
parameter turned on. Then, filter the indicator values equal to "left_only"
with .query()
, as follows:
df1.merge(df2, on='id', how='left', indicator='ind').query('ind == "left_only"')
Result:
id name ind
0 1 one left_only
3 4 four left_only
Optionally, you can remove the indicator column, as follows:
df1.merge(df2, on='id', how='left', indicator='ind').query('ind == "left_only"').drop('ind', axis=1)
Result:
id name
0 1 one
3 4 four
Upvotes: 1
Reputation: 49375
you have the right solution,only you do interpret the results wrong.
This will give you the result without index
import pandas as pd
d = {'id': [1, 2,3,4], 'col2': ['one','two','three','four']}
d1 = {'id': [2,3]}
df1 = pd.DataFrame(data=d)
df2 = pd.DataFrame(data=d1)
result = df1[~df1.id.isin(df2.id)]
print(result.to_string(index=False))
Upvotes: 1
Reputation: 2132
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.array([[1, "one"], [2, "two"], [3, "three"], [4, "four"]]),
columns=['id', 'name '])
df2 = pd.DataFrame(np.array([[1], [2]]),
columns=['id'])
df1.drop(df2['id'], axis=0,inplace=True)
df1
Upvotes: 0
Reputation: 195438
Try:
print(df1[~df1["id"].isin(df2["id"])])
Prints:
id name
0 1 one
3 4 four
Upvotes: 0