Reputation: 21
I have a data frame and I need to sort it samples by several columns. For example, having the below data frame, it needs to sort by the maximum price and maximum horse-power for each different company.
company price horse-power
toyota 20000 100
toyota 20000 125
bmw 40000 150
volkswagen 33000 130
bmw 45000 200
toyota 20000 120
audi 30000 150
volkswagen 32000 110
audi 35000 180
volkswagen 33000 135
the output should be:
company price horse-power
volkswagen 33000 135
bmw 45000 200
toyota 20000 125
audi 35000 180
I have a data frame and I need to sort it samples by several columns. For example, having the below data frame, it needs to sort by the maximum price and maximum horse-power for each different company.
company price horse-power
toyota 20000 100
toyota 20000 125
bmw 40000 150
volkswagen 33000 130
bmw 45000 200
toyota 20000 120
audi 30000 150
volkswagen 32000 110
audi 35000 180
volkswagen 33000 135
the output should be:
company price horse-power
volkswagen 33000 135
bmw 45000 200
toyota 20000 125
audi 35000 180
Upvotes: 0
Views: 32
Reputation: 393
If your dataset contains more columns than these and you only want to keep the max of a subset of the columns, do it this way:
subset_to_max = ['price', 'horse-power']
df.sort_values(subset_to_max, ascending=False).drop_duplicates(['company'], keep='first')
Upvotes: 0
Reputation: 11650
groupby with a max
df.groupby('company',as_index=False).max()
company price horse-power
0 audi 35000 180
1 bmw 45000 200
2 toyota 25000 125
3 volkswagen 33000 130
Upvotes: 1