Ladenkov Vladislav

Reputation: 1297

Pandas analogue of SQL's "NOT IN" operator

Surprisingly, I can't find an analogue of SQL's "NOT IN" operator for pandas DataFrames.

import pandas as pd

A = pd.DataFrame({'a': [6, 8, 3, 9, 5],
                  'b': ['II', 'I', 'I', 'III', 'II']})

B = pd.DataFrame({'c': [1, 2, 3, 4, 5]})

I want all rows from A whose value in column a does not appear in B's column c. Something like:

A = A[ A.a not in B.c]

Upvotes: 8

Views: 8577

Answers (1)

jezrael

Reputation: 863166

I think you are really close. Use isin and negate the resulting boolean mask with ~. You can also pass the Series B.c directly instead of converting it to a list:

print(~A.a.isin(B.c))
0     True
1     True
2    False
3     True
4    False
Name: a, dtype: bool

A = A[~A.a.isin(B.c)]
print(A)
   a    b
0  6   II
1  8    I
3  9  III
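
For reference, here is a self-contained sketch of the same filter using the sample frames from the question. The variable names not_in and anti are only illustrative. The second half shows an alternative that is not part of the answer above: a left merge with indicator=True, which behaves like a SQL anti-join and may be handy when the comparison spans several columns.

```python
import pandas as pd

A = pd.DataFrame({'a': [6, 8, 3, 9, 5],
                  'b': ['II', 'I', 'I', 'III', 'II']})
B = pd.DataFrame({'c': [1, 2, 3, 4, 5]})

# NOT IN: keep rows of A whose 'a' value is absent from B['c'].
# isin() builds a boolean mask; ~ inverts it.
not_in = A[~A['a'].isin(B['c'])]
print(not_in)

# Alternative (an assumption, not from the answer): anti-join via merge.
# indicator=True adds a '_merge' column marking where each row came from.
merged = A.merge(B, left_on='a', right_on='c', how='left', indicator=True)
anti = merged.loc[merged['_merge'] == 'left_only', ['a', 'b']]
print(anti)
```

Both approaches should keep the rows with a equal to 6, 8 and 9. The isin version preserves A's original index, while merge builds a fresh one, so the index labels are not guaranteed to match in general.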

Upvotes: 9
