No_body
No_body

Reputation: 842

Get all rows after the last occurrence of a specific value in pandas

My dataframe looks like

ID  colA  
1      B     
1      D     
2      B    
2      D     
2      C     

I have return all rows after the last occurrence of event B in each group. The output will be :

ID  colA   
1      D   
2      D     
2      C  

I tried

a = df['colA'].str.contains('B').groupby(df['ID'])
b = df[(a.transform('sum') - a.cumsum()).eq(0)]

and it's working fine so far. I am just wondering if there is any alternative approach to achieve this?

Upvotes: 4

Views: 1638

Answers (3)

yatu
yatu

Reputation: 88236

You could do:

ix = (df.colA.eq('B')
        .cumsum()
        .groupby(df.ID)
        .apply(lambda x: x.loc[x.idxmax()+1:]).index.get_level_values(1))

df.loc[ix,:]

    ID colA
1   1    D
3   2    D
4   2    C

Upvotes: 3

cs95
cs95

Reputation: 402383

Reverse your rows (this is important). Then call groupby and cumsum, and take all rows with (reversed) cumsum value equal to zero.

df[df.colA.eq('B')[::-1].astype(int).groupby(df.ID).cumsum().eq(0)]

   ID colA
1   1    D
3   2    D
4   2    C

Upvotes: 7

BENY
BENY

Reputation: 323226

IIUC

def yourlast(x):
        return x.loc[x.colA.where(df.colA.eq('B')).last_valid_index()+1:]
df.groupby('ID').apply(yourlast)
Out[163]: 
      ID colA
ID           
1  1   1    D
2  3   2    D
   4   2    C

Upvotes: 2

Related Questions