xtian
xtian

Reputation: 2947

How do I join two dataframes on Boolean and return the inverse

I'm using Pandas itertuples method to perform row matching between two dataframes (df-A & df-B). The result is a copy of df-B with only the passed results.

df-B before:

   B   Y
0  2  10
1  4  15
2  6  15

df-B after (as df-B2):

   B   Y     e
0  2  10  True
1  6  15  True

How can I compare df-B and df-B2 and return only the missing (implied False) rows?

   B   Y
1  4  15

Upvotes: 0

Views: 990

Answers (2)

jezrael
jezrael

Reputation: 862911

Use ~ for inverse of mask:

df-B[~df-B.e]

Solution with DataFrame.isin is more general because it check index and column values too. So column Y is set to index and then get mask. Last use boolean indexing:

print (df1)
    B   Y
0  2  10
1  4  15
2  6  15

print (df2)
   B   Y     e
0  2  10  True
2  6  15  True

df11 = df1.set_index('B')[['Y']]
df22 = df2.set_index('B')[['Y']]
mask = df11.isin(df22).reset_index()
print (mask)
   B      Y
0  2   True
1  4  False
2  6   True

print (df1[mask.Y])
   B   Y
0  2  10
2  6  15

print (df1[~mask.Y])
   B   Y
1  4  15

Upvotes: 1

Craig
Craig

Reputation: 4855

This comes from the excellent Pandas Cheat Sheet and I think it will do what you want:

pd.merge(df-B, df-B2, how='outer',
    indicator=True)
    .query('_merge == "left_only"')
    .drop(['_merge'],axis=1)

This is a way to get rows that appear in df-B that don't appear in df-B2.

Upvotes: 2

Related Questions