doyz
doyz

Reputation: 886

Python2.7: Subset dataframe based on condition in first row of groupby

I would like to subset a pandas dataframe based on a condition which only the first row in the groupby is subjected to.

Dataframe is to be grouped by "name", "driverRef", "tyre", "stint"

For eg, in the df below, because alonso started his stint 2 in position 12, i want to remove all of alonso's records from the df.

    name                   driverRef stint  tyre      lap   pos     
0   Australian Grand Prix   alonso  1.0     Super soft  1   9        
1   Australian Grand Prix   alonso  1.0     Super soft  2   9        
2   Australian Grand Prix   alonso  1.0     Super soft  3   9       
3   Australian Grand Prix   alonso  2.0     Super soft  20   12        
4   Australian Grand Prix   alonso  2.0     Super soft  21   11     
5   Australian Grand Prix   alonso  2.0     Super soft  22   10       

Expected output:

    name                   driverRef stint  tyre      lap   pos     
0   Australian Grand Prix   alonso  1.0     Super soft  2   9        
1   Australian Grand Prix   alonso  1.0     Super soft  3   9        
2   Australian Grand Prix   alonso  1.0     Super soft  4   9        

I tried this, but it doesn't implemenent the effect correctly:

df.loc[df.groupby(['name', 'driverRef', 'tyre', 'stint']).first().reset_index()['position'].isin(list(range(1,11))).index]

EDIT: My code does work, but please see @jezrael's answer for a more succint/better way of writing.

Upvotes: 1

Views: 41

Answers (1)

jezrael
jezrael

Reputation: 863166

You are really close, need transform for return Series with same length as original df:

s = df.groupby(['name', 'driverRef', 'tyre', 'stint'])['pos'].transform('first')
print (s)
0     9
1     9
2     9
3    12
4    12
5    12
Name: pos, dtype: int64

df = df[s.isin(list(range(1,11)))]
print (df)
                    name driverRef  stint        tyre  lap  pos
0  Australian Grand Prix    alonso    1.0  Super soft    1    9
1  Australian Grand Prix    alonso    1.0  Super soft    2    9
2  Australian Grand Prix    alonso    1.0  Super soft    3    9

Upvotes: 3

Related Questions