Reputation: 1617
Assume I have two CSV files, csv1 and csv2. I want to delete every record from csv2 that matches a record in csv1. Both files have a unique identifier column, sku.
csv1:
sku name
Gk125 Jhone
GK126 Mike
csv2:
sku name
Gk127 Doe
GK128 Hock
GK126 Mike #this is the duplicate record, which is already in csv1
My expected result for csv2 is:
sku name
Gk127 Doe
GK128 Hock
I tried this, but it didn't work:
old_file = list(old['sku'])
updated = new[~new['sku'].isin(old)]
updated.to_csv('...my path/updated.csv')
Upvotes: 2
Views: 1405
Reputation: 4229
Works fine for me:
import pandas as pd

df1 = pd.DataFrame(data={'sku':['Gk125', 'GK126'], 'name':['Jhone', 'Mike']})
df2 = pd.DataFrame(data={'sku':['Gk127', 'GK128', 'GK126'], 'name':['Doe', 'Hock', 'Mike']})
# Keep only the rows of df2 whose sku does not appear in df1
print(df2[~df2['sku'].isin(df1['sku'])])
Output:
sku name
0 Gk127 Doe
1 GK128 Hock
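One likely reason the snippet in the question removed nothing: it builds old_file from the sku column but then passes old (the whole DataFrame) to isin, so the values are probably being compared against the column names rather than the sku values. Here is a minimal sketch of the same filter applied to CSV input. It assumes the files are comma-separated with a header row, and it uses in-memory io.StringIO buffers as stand-ins for the real paths, which the question elides.

```python
import io
import pandas as pd

# Stand-ins for the two files (assumed comma-separated with a header row).
# With real files, pass the file paths to pd.read_csv instead.
csv1 = io.StringIO("sku,name\nGk125,Jhone\nGK126,Mike\n")
csv2 = io.StringIO("sku,name\nGk127,Doe\nGK128,Hock\nGK126,Mike\n")

old = pd.read_csv(csv1)
new = pd.read_csv(csv2)

# Pass the sku column itself to isin, not the whole DataFrame.
updated = new[~new['sku'].isin(old['sku'])]

# index=False keeps the row index out of the written CSV.
out = io.StringIO()
updated.to_csv(out, index=False)
result = out.getvalue()
print(result)
```

Note that the comparison is case-sensitive, so 'Gk126' and 'GK126' would be treated as different sku values.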
Upvotes: 4