Reputation: 2947
I'm using Pandas itertuples
method to perform row matching between two dataframes (df-A & df-B). The result is a copy of df-B with only the passed results.
df-B before:
B Y
0 2 10
1 4 15
2 6 15
df-B after (as df-B2):
B Y e
0 2 10 True
1 6 15 True
How can I compare df-B and df-B2 and return only the missing (implied False) rows?
B Y
1 4 15
Upvotes: 0
Views: 990
Reputation: 862911
Use ~
for inverse of mask:
df-B[~df-B.e]
Solution with DataFrame.isin
is more general because it check index and column values too. So column Y
is set to index and then get mask. Last use boolean indexing
:
print (df1)
B Y
0 2 10
1 4 15
2 6 15
print (df2)
B Y e
0 2 10 True
2 6 15 True
df11 = df1.set_index('B')[['Y']]
df22 = df2.set_index('B')[['Y']]
mask = df11.isin(df22).reset_index()
print (mask)
B Y
0 2 True
1 4 False
2 6 True
print (df1[mask.Y])
B Y
0 2 10
2 6 15
print (df1[~mask.Y])
B Y
1 4 15
Upvotes: 1
Reputation: 4855
This comes from the excellent Pandas Cheat Sheet and I think it will do what you want:
pd.merge(df-B, df-B2, how='outer',
indicator=True)
.query('_merge == "left_only"')
.drop(['_merge'],axis=1)
This is a way to get rows that appear in df-B that don't appear in df-B2.
Upvotes: 2