Reputation: 107
I have two dataframes:
import pandas as pd

df1 = pd.DataFrame({'System': ['b0001', 'b0002']})
df2 = pd.DataFrame({'System': ['b0001']})
I want to print the value in column System of df1 that is NOT contained in column System of df2. The output should only be:
b0002
My current code is:
for i in df1.index:
    if df1.System[i] not in df2.System:
        print(df1.System[i])
But the output is:
b0001
b0002
I can't figure out why it still prints b0001. I've tried isin and the output is the same.
Any help will be appreciated.
Upvotes: 4
Views: 117
Reputation: 19947
import numpy as np

# This solution only prints the unique elements of df1 that are not in df2
np.setdiff1d(df1, df2)
Out[236]: array(['b0002'], dtype=object)
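To illustrate the "unique elements" caveat, here is a minimal sketch. The data is hypothetical: the extra 'b0003' and the repeated 'b0002' rows are assumptions added for illustration and are not part of the question. np.setdiff1d returns sorted, de-duplicated values, whereas a boolean mask built with isin keeps every matching row in its original order.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'b0002' is duplicated on purpose to show the difference.
df1 = pd.DataFrame({'System': ['b0003', 'b0002', 'b0001', 'b0002']})
df2 = pd.DataFrame({'System': ['b0001']})

# setdiff1d returns the sorted, de-duplicated values of df1 not found in df2.
unique_missing = np.setdiff1d(df1['System'].to_numpy(), df2['System'].to_numpy())

# A boolean mask keeps the original row order and any duplicates.
all_missing = df1.loc[~df1['System'].isin(df2['System']), 'System'].tolist()

print(unique_missing.tolist())  # ['b0002', 'b0003']
print(all_missing)              # ['b0003', 'b0002', 'b0002']
```

Which one is preferable depends on whether repeated values and row order matter for the result.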
Upvotes: 2
Reputation: 294258
numpy
np.setdiff1d(df1.System.values, df2.System.values)
array(['b0002'], dtype=object)
Upvotes: 3
Reputation: 153460
A pandas way of doing this is to use isin as follows:
df1[~df1.System.isin(df2.System)]
Output:
  System
1  b0002
However, to make your loop work you need .values, because the in operator on a Series checks the index labels rather than the values:
for i in df1.index:
    if df1.System[i] not in df2.System.values:
        print(df1.System[i])
Output:
b0002
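As a minimal sketch of why the original loop printed b0001: membership tests with in on a Series look at the index labels, so a stored value is never found unless it also happens to be a label. The Series s below is a hypothetical stand-in for df2.System.

```python
import pandas as pd

s = pd.Series(['b0001'])  # hypothetical stand-in; default index is 0

# `in` on a Series tests the index labels, not the stored values.
label_hit = 0 in s           # 0 is an index label
value_miss = 'b0001' in s    # 'b0001' is a value, not a label

# Test against the values explicitly instead.
value_hit = 'b0001' in s.values
isin_hit = bool(s.isin(['b0001']).any())

print(label_hit, value_miss, value_hit, isin_hit)  # True False True True
```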
Upvotes: 4