Pandas: GroupBy and Order Groups based on max value in each group

Question

I have a Pandas DataFrame which contains tracks, score and some other columns.

I want to group by on "tracks" and then sorting of those groups based on the maximum value on the "score".

Example DataFrame:

tracks       score
20            2.2
20            1.5
25            3.5
24            1.2
24            5.5

Expected Output (I want to compare the highest values from each group and sort all the group from highest to lowest, though I don't want to lose any other data - means I want to show all the rows):

tracks       score
24            5.5
              1.2
25            3.5
20            2.2
              1.5

Currently, I am getting following output (my score is sorted but after group by my tracks are sorted based on the track numbers):

    tracks       score
20            2.2
              1.5
24            5.5
              4.2
25            3.5

My Approach till now: 1. I've sorted all the values by the score

sub_df = sub_df.sort_values("score")

Then I am doing the following to get the output (I need in dictionary format):

url_dict = sub_df.groupby('track')['url'].apply(list).to_dict()

I also tried to used OrderedDict but it is of no use (at least for now) as the groupBy command is sending the wrong sequence of data.

Pandas = 0.23, Python = 3.6.4

jezrael · Accepted Answer

Create helper column by GroupBy.transform and sort by multiple columns by DataFrame.sort_values, last remove helper column:

sub_df['max'] = sub_df.groupby('tracks')['score'].transform('max')

sub_df = sub_df.sort_values(["max","score"], ascending=False).drop('max', axis=1)
#if necessary sorting also by tracks column
#sub_df = sub_df.sort_values(["max","tracks","score"], ascending=False).drop('max', axis=1)
print (sub_df)
   tracks  score
4      24    5.5
3      24    1.2
2      25    3.5
0      20    2.2
1      20    1.5

Pandas: GroupBy and Order Groups based on max value in each group

Answers (2)

Related Questions