SCool

Reputation: 3395

How to apply multiple masks to a dataframe at the same time?

I have set up three masks for my df, and I want to filter out these values.

For example, some random masks:

mask1 = df['column1'].isnull()
mask2 = df['column2'] > 5
mask3 = df['column3'].str.contains('hello')

Now how do I combine these masks to filter out these values? Is this the correct way? Using both ~ and | ?

masked_df = df[~mask1 | ~mask2 | ~mask3]

I have so many rows in my dataframe that I can't be 100% sure with manual checking to see if it's correct.

Upvotes: 6

Views: 7393

Answers (1)

jezrael

Reputation: 863701

Your solution works. By De Morgan's laws, it is equivalent to combining the masks with bitwise AND and inverting the result once:

masked_df = df[~(mask1 & mask2 & mask3)]
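Since manual checking is not practical on a large frame, one way to gain confidence is to compare both forms on a small sample. The following is a minimal sketch; the sample data and the names `either`, `inverted`, `same`, and `dropped` are assumptions made for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the original question
df = pd.DataFrame({
    "column1": [np.nan, np.nan, 4.0, 5.0, np.nan],
    "column2": [7, 3, 9, 4, 8],
    "column3": ["hello world", "hello", "hello", "bye", "bye"],
})

mask1 = df["column1"].isnull()
mask2 = df["column2"] > 5
mask3 = df["column3"].str.contains("hello")

# Form from the question: keep a row if at least one mask is False
either = df[~mask1 | ~mask2 | ~mask3]

# Form from this answer: drop a row only if all masks are True
inverted = df[~(mask1 & mask2 & mask3)]

same = either.equals(inverted)                           # do both forms agree?
dropped = df.index.difference(inverted.index).tolist()   # labels of removed rows
print(same, dropped)
```

Note that both forms drop only the rows where all three masks are True. If the goal is instead to drop rows that match any single mask, the condition would be `~(mask1 | mask2 | mask3)`.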

If the masks are stored in a list, the same filter can be written with np.logical_and.reduce:

import numpy as np

masks = [mask1, mask2, mask3]

# Rows where every mask is True are dropped; all other rows are kept
m = df[~np.logical_and.reduce(masks)]
print(m)
   A  column1  column2 column3
2  c      4.0        9   hello
3  d      5.0        4   hello
4  e      5.0        2   hello
5  f      4.0        3   hello
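The list-based version can be checked against the chained version in the same way. This is a sketch on hypothetical data (not the frame that produced the output above); the names `all_true`, `reduced`, `chained`, `kept`, and `matches` are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen so that only the first row satisfies all masks
df = pd.DataFrame({
    "column1": [np.nan, 4.0, np.nan],
    "column2": [9, 9, 1],
    "column3": ["hello", "hello", "hello"],
})

masks = [
    df["column1"].isnull(),
    df["column2"] > 5,
    df["column3"].str.contains("hello"),
]

# Element-wise AND across every mask in the list: one boolean per row
all_true = np.logical_and.reduce(masks)

reduced = df[~all_true]                             # list-based form
chained = df[~(masks[0] & masks[1] & masks[2])]     # explicit chained form

kept = reduced.index.tolist()
matches = reduced.equals(chained)
print(kept, matches)
```

The advantage of the list form is that the number of masks does not have to be known in advance, so masks can be appended conditionally before filtering.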

Upvotes: 12
