John Doe

Reputation: 105

Drop entire pandas group based on condition

I have this df:

       Plate       Route        Speed       Dif     hour
0      0660182      RUTA 66     10          NaN     13
1      939CH001M    RUTA 51     22          33.0    13
2      596NZ008M    RUTA 102    0           34.0    13
3      0790024      RUTA 79     0           33.0    13
4      947CH045M    RUTA 50     28          33.0    13
...     ...     ...     ...     ...     ...
1279634 0120414     RUTA 12     0           NaN     5
1279635 1090016     200826      0           NaN     5
1279636 0350144     RUTA 35     0           NaN     5
1279637 006908      RUTA 106    0           NaN     5
1279638 0340071     RUTA 34     1           NaN     5

I want to drop every plate group (there are many records per plate) in which any 'Dif' value is 60 or more. So, following a post from this site, I tried:

f = lambda df: not df[df['Dif'] < 60].empty
filtered = df.groupby('Plate').filter(f)

but I still get Dif values over 60 in the filtered result. What is wrong here? Or else, how can I do this filter using groupby? (I also tried with greater-than, in case my logic was failing, and still can't get it to work.)

I appreciate your help.

Upvotes: 1

Views: 24

Answers (1)

Quang Hoang

Reputation: 150745

You can use groupby.transform:

# only select groups whose `Dif` are all < 60
mask = (df['Dif']<60).groupby(df['Plate']).transform('all')
df[mask]
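As a rough sketch of how that mask behaves, here is the same idea on a small made-up frame (the plates and values are hypothetical, not the asker's data). Since the question's Dif column contains NaN, it is worth noting that NaN < 60 evaluates to False, so a group with a missing Dif is dropped by the strict mask; the second variant is one possible way to tolerate NaN, assuming that is what is wanted.

```python
import numpy as np
import pandas as pd

# Made-up sample data, not the asker's real frame.
df = pd.DataFrame({
    "Plate": ["A", "A", "B", "B", "C", "C"],
    "Dif": [10.0, 75.0, 20.0, 30.0, np.nan, 40.0],
})

# Strict version: every row in the group must satisfy Dif < 60.
# NaN < 60 is False, so plate C is dropped along with plate A.
strict_mask = (df["Dif"] < 60).groupby(df["Plate"]).transform("all")
strict = df[strict_mask]

# NaN-tolerant version: drop a group only if some row has Dif >= 60.
has_large = (df["Dif"] >= 60).groupby(df["Plate"]).transform("any")
tolerant = df[~has_large]

print(strict["Plate"].tolist())    # ['B', 'B']
print(tolerant["Plate"].tolist())  # ['B', 'B', 'C', 'C']
```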

Or your function:

f = lambda df: (df['Dif']<60).all()
filtered = df.groupby('Plate').filter(f)

Note that your original function keeps every group that has at least one Dif < 60, rather than only the groups in which all Dif values are < 60.
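To make the difference concrete, here is a minimal sketch comparing the two filter functions on made-up data (the frame and the names keep_if_any / keep_if_all are only for illustration). Plate A has one small and one large Dif, so the original function keeps it, including its row with Dif = 75.

```python
import pandas as pd

# Made-up sample data: plate A mixes small and large Dif values.
df = pd.DataFrame({
    "Plate": ["A", "A", "B", "B", "C"],
    "Dif": [10.0, 75.0, 20.0, 30.0, 90.0],
})

# Original filter: True as soon as one row in the group has Dif < 60.
keep_if_any = lambda g: not g[g["Dif"] < 60].empty
# Corrected filter: True only if every row in the group has Dif < 60.
keep_if_all = lambda g: (g["Dif"] < 60).all()

by_any = df.groupby("Plate").filter(keep_if_any)
by_all = df.groupby("Plate").filter(keep_if_all)

print(by_any["Dif"].tolist())  # [10.0, 75.0, 20.0, 30.0]
print(by_all["Dif"].tolist())  # [20.0, 30.0]
```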

Upvotes: 1
