Reputation: 1792

How to delete all rows from a group, if a different column meets condition in pandas

I have the following dataframe and then a calculation to identify outliers:

import pandas as pd
#Create DF
d = { 
    'Day': ['M','M','M','T','T','T','W','W','W','W','W','W','W','F','F','F','F'],
    'Criteria':[30,35,2.5,3,6,15,20,1.5,4,25,50,75,2,4,9,2,36]  
    }           
df = pd.DataFrame(data=d)
#Outliers calc
lower_limit = df.Criteria.quantile(.05)
upper_limit = df.Criteria.quantile(.95)
df['Flag'] = df.Criteria.between(lower_limit, upper_limit)

df

You can see that the Day W contails an outlier (2 in this case).

Essentially if a Flag equals False I would like to drop all rows from that Day group.

So my expected result is all 'W' Days are removed from the table. Thanks in advance!

Upvotes: 1

Answers (3)

BENY

Reputation: 323226

Try this should be fast than filter

out = df[~df.Day.isin(df.loc[~df.Flag,'Day'])]
Out[20]: 
   Day  Criteria  Flag
0    M      30.0  True
1    M      35.0  True
2    M       2.5  True
3    T       3.0  True
4    T       6.0  True
5    T      15.0  True
13   F       4.0  True
14   F       9.0  True
15   F       2.0  True
16   F      36.0  True

Upvotes: 0

tomtomfox

Reputation: 304

I think that's what you want :

import numpy as np

df = df[df.groupby('Day').Flag.transform(np.all)]
df

Upvotes: 1

Mustafa Aydın

Reputation: 18306

Groupby.filter: retains those groups of Days that have their Flags all True:

>>> df.groupby("Day").filter(lambda d: d.Flag.all())

   Day  Criteria  Flag
0    M      30.0  True
1    M      35.0  True
2    M       2.5  True
3    T       3.0  True
4    T       6.0  True
5    T      15.0  True
13   F       4.0  True
14   F       9.0  True
15   F       2.0  True
16   F      36.0  True

Upvotes: 1

How to delete all rows from a group, if a different column meets condition in pandas

Answers (3)

Related Questions