SirTobi
SirTobi

Reputation: 177

Pandas Dataframe ordering and sorting of column values

I was wondering if someone knows a good way on how to sort a pandas dataframe in the following way:

a) I have the following randomly sorted data with an id that appears multiple times and a label that is either 0 or 1:

id | label
------ | ------ 
1 | 1
1 | 0
1 | 0
2 | 1
2 | 0
2 | 0
3 | 0
3 | 0
3 | 0

I would like to sort the labels in ascending order and then also sort the id's in ascending order, but not grouped, so like this:

id | label
------ | ------ 
1 | 0
2 | 0
3 | 0
1 | 0
2 | 0
3 | 0
3 | 0
1 | 1
2 | 1

Thanks in advance!

Upvotes: 2

Views: 355

Answers (1)

Scott Boston
Scott Boston

Reputation: 153460

First sort by id and label, then use cumcount to create an index representing 1,2,3 groups, then sort on index and by labels.

df_out = df.sort_values(by=['id','label'])\
  .set_index(df.groupby('id').cumcount())\
  .sort_index()\
  .sort_values(by='label')

Output:

   id  label
0   1      0
0   2      0
0   3      0
1   1      0
1   2      0
1   3      0
2   3      0
2   1      1
2   2      1

Upvotes: 3

Related Questions