Lucas Aimaretto

Reputation: 1489

Groupby and keep rows depending on string value

I have this DF:

In [106]: dfTest = pd.DataFrame( {'name':['a','a','b','b'], 'value':['x','y','x','h']})    
In [107]: dfTest
Out[107]: 
  name value
0    a     x
1    a     y
2    b     x
3    b     h

I want to end up with one row per name group, and which value is kept depends on the group's contents: if a group contains h in value, I'd like to keep that row. Otherwise, any row from the group will do. For example:

In [109]: dfTest                                                                                         
Out[109]: 
  name value
0    a     x
1    b     h

Upvotes: 0

Views: 41

Answers (2)

Quang Hoang

Reputation: 150785

Another approach with drop_duplicates:

(dfTest.loc[dfTest['value'].eq('h').sort_values().index]
   .drop_duplicates('name', keep='last')
)

Output:

  name value
1    a     y
3    b     h
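
If you want the sort-then-deduplicate idea to be deterministic about which row survives, here is a minimal sketch that uses a stable sort. It is a variation on the snippet above, not the same code: the names `not_h`, `order` and `result` are only illustrative, and the final `sort_values`/`reset_index` is an assumption about the shape you want back.

```python
import pandas as pd

dfTest = pd.DataFrame({'name': ['a', 'a', 'b', 'b'],
                       'value': ['x', 'y', 'x', 'h']})

# False sorts before True, so rows whose value is 'h' come first.
# mergesort is stable, so the other rows keep their original relative order.
not_h = dfTest['value'].ne('h')
order = not_h.sort_values(kind='mergesort').index

result = (dfTest.loc[order]
          .drop_duplicates('name')      # keep the first row seen per name
          .sort_values('name')
          .reset_index(drop=True))
print(result)
```

With the sample frame this keeps (a, x) and (b, h), i.e. the first row of a group unless the group has an h.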

Upvotes: 1

Scott Boston

Reputation: 153500

You can do it this way:

dfTest.reindex(dfTest.groupby('name')['value'].agg(lambda x: (x=='h').idxmax()))

Output:

      name value
value           
0        a     x
3        b     h
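
To spell out why this works, here is a small sketch of the same idea split into steps. On a boolean Series, idxmax returns the label of the first True, and when the group has no True at all it falls back to the group's first label. The names `picked` and `result`, and the use of `.loc` with `reset_index` instead of `reindex`, are my own choices for illustration.

```python
import pandas as pd

dfTest = pd.DataFrame({'name': ['a', 'a', 'b', 'b'],
                       'value': ['x', 'y', 'x', 'h']})

# Per group: label of the first row equal to 'h',
# or the group's first label when no row equals 'h'.
picked = dfTest.groupby('name')['value'].agg(lambda s: (s == 'h').idxmax())

# Select the chosen rows and renumber them from 0.
result = dfTest.loc[picked.tolist()].reset_index(drop=True)
print(result)
```

Selecting with `.loc` on a plain list of labels also avoids the extra `value` index name that shows up in the output above.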

Upvotes: 2
