Finding the maximum value in a group with differentiation

Question

I have a Pandas DataFrame that looks like this:

index	ID	value_1	value_2
0	1	200	126
1	1	200	127
2	1	200	128.1
3	1	200	125.7
4	2	300.1	85
5	2	289.4	0
6	2	0	76.9
7	2	199.7	0

My aim is to find all rows in each ID-group (1,2 in this example) which have the max value for value_1 column. The second condition is if there are multiple maximum values per group, the row where the value in column value_2 is maximum should be taken.

So the target table should look like this:

index	ID	value_1	value_2
0	1	200	128.1
1	2	300.1	85

jezrael · Accepted Answer

Use DataFrame.sort_values by all 3 columns and then DataFrame.drop_duplicates:

df1 = (df.sort_values(['ID', 'value_1', 'value_2'], ascending=[True, False, False])
         .drop_duplicates('ID'))
print (df1)
   ID  value_1  value_2
2   1    200.0    128.1
4   2    300.1     85.0

Finding the maximum value in a group with differentiation

Answers (1)

Related Questions