nonegiven72
nonegiven72

Reputation: 427

Pandas, selecting first oberservation of a group and sort

New to pandas and trying to get better. I would like to group_by the "id" column, then order_by "col2" then "col3", then take the top observation. Thanks in advance

  In[5]:df
  Out[5]:
            id         col2         col3
  0         1           2            5 
  1         1           2            2 
  2         2           8            3 
  3         2           3            6  
  4         2           3            1 
  5         1           3            7 
  6         2           5            6 
  7         1           8            6

Desired:

  In[6]:df_selection
  Out[6]:
            id         col2         col3
  0         1           2            2 
  1         2           3            1 

Upvotes: 1

Views: 37

Answers (1)

Ami Tavory
Ami Tavory

Reputation: 76297

A very short way to do this would be to sort by these columns, group, and take the first of each group:

>>> df.sort_values(by=['col2', 'col3']).groupby('id').first()
    col2    col3
id      
1   2   2
2   3   1

To get the exact output as your question, just reset the index:

>>> df.sort_values(by=['col2', 'col3']).groupby('id').first().reset_index()
    id  col2    col3
0   1   2   2
1   2   3   1

Upvotes: 1

Related Questions