Reputation: 177
I was wondering if someone knows a good way on how to sort a pandas dataframe in the following way:
a) I have the following randomly sorted data with an id that appears multiple times and a label that is either 0 or 1:
id | label ------ | ------ 1 | 1 1 | 0 1 | 0 2 | 1 2 | 0 2 | 0 3 | 0 3 | 0 3 | 0
I would like to sort the labels in ascending order and then also sort the id's in ascending order, but not grouped, so like this:
id | label ------ | ------ 1 | 0 2 | 0 3 | 0 1 | 0 2 | 0 3 | 0 3 | 0 1 | 1 2 | 1
Thanks in advance!
Upvotes: 2
Views: 355
Reputation: 153460
First sort by id and label, then use cumcount to create an index representing 1,2,3 groups, then sort on index and by labels.
df_out = df.sort_values(by=['id','label'])\
.set_index(df.groupby('id').cumcount())\
.sort_index()\
.sort_values(by='label')
Output:
id label
0 1 0
0 2 0
0 3 0
1 1 0
1 2 0
1 3 0
2 3 0
2 1 1
2 2 1
Upvotes: 3