Reputation: 3395
I have set up three masks for my df, and I want to filter out these values.
For example, some random masks:
mask1 = df['column1'].isnull()
mask2 = df['column2'] > 5
mask3 = df['column3'].str.contains('hello')
Now how do I combine these masks to filter out these values?
Is this the correct way, using both ~ and |?
masked_df = df[~mask1 | ~mask2 | ~mask3]
I have so many rows in my dataframe that I can't be 100% sure with manual checking to see if it's correct.
Upvotes: 6
Views: 7393
Reputation: 863701
Your solution is fine. By De Morgan's law it is equivalent to chaining the masks with bitwise AND (&) and inverting the combined condition:
masked_df = df[~(mask1 & mask2 & mask3)]
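Since manual checking is not practical on a large frame, one way to gain confidence is to compare both expressions on a small sample. The sketch below uses made-up data (only the column names come from the question) and also shows the stricter variant that drops a row when any single mask matches, in case that is what "filter out" was meant to do:

```python
import pandas as pd

# Hypothetical toy data; column names follow the question, values are invented.
df = pd.DataFrame({
    "column1": [None, None, 4.0, 5.0],
    "column2": [9, 2, 9, 4],
    "column3": ["hello", "hello", "hello", "bye"],
})

mask1 = df["column1"].isnull()
mask2 = df["column2"] > 5
mask3 = df["column3"].str.contains("hello")

# The expression from the question and the inverted AND form.
or_of_inverted = df[~mask1 | ~mask2 | ~mask3]
inverted_and = df[~(mask1 & mask2 & mask3)]

# Both select the same rows: a row is dropped only if ALL three masks match.
same = or_of_inverted.equals(inverted_and)
kept_index = inverted_and.index.tolist()

# Different behaviour: drop a row as soon as ANY mask matches.
dropped_any = df[~(mask1 | mask2 | mask3)]

print(same, kept_index, dropped_any.index.tolist())
```

On this sample only row 0 satisfies all three masks, so both forms keep rows 1 to 3, while the "any" variant keeps only row 3.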
If the masks are in a list, the solution above can be rewritten with np.logical_and.reduce:
import numpy as np
masks = [mask1, mask2, mask3]
# AND all masks together row by row, then invert the result
m = df[~np.logical_and.reduce(masks)]
print(m)
A column1 column2 column3
2 c 4.0 9 hello
3 d 5.0 4 hello
4 e 5.0 2 hello
5 f 4.0 3 hello
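For reference, here is a self-contained sketch of the list-based approach. The data is invented for illustration and is not the frame that produced the output above. It also assumes column3 may contain missing values: in that case str.contains yields a mask that is not purely boolean, which can break the inversion, so na=False is passed to treat missing strings as "no match":

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame; values are made up for illustration.
df = pd.DataFrame({
    "column1": [np.nan, 4.0, np.nan],
    "column2": [9, 9, 7],
    "column3": ["hello", "hello", None],
})

masks = [
    df["column1"].isnull(),
    df["column2"] > 5,
    # na=False counts a missing string as "does not contain"
    df["column3"].str.contains("hello", na=False),
]

# Element-wise AND across every mask in the list; the result is a
# NumPy boolean array with one entry per row.
all_match = np.logical_and.reduce(masks)

# Keep the rows that do NOT satisfy every mask.
result = df[~all_match]
print(result)
```

The same pattern works for any number of masks, and np.logical_or.reduce can be swapped in when a row should be dropped as soon as one mask matches.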
Upvotes: 12