Reputation: 326
I have a dataframe with two columns. One is numeric and the other is categorical. For example,
   c1 c2
0  15  A
1  11  A
2  12  B
3  40  C
I want to sort by c1 in descending order but keep rows with the same c2 value together (so all the A's stay together). Where a category has multiple entries, the category is positioned by its largest c1 value, and its rows are sorted by c1 within the category.
So the end result would be:
   c1 c2
0  40  C
1  15  A
2  11  A
3  12  B
How should I do this? Thanks
Upvotes: 1
Views: 1943
Reputation: 35676
We can create a temporary column with groupby transform('max') to get the max value per group, then sort_values with ascending=False, then drop the added column.
df = (
df.assign(key=df.groupby('c2')['c1'].transform('max'))
.sort_values(['key', 'c2', 'c1'], ascending=False, ignore_index=True)
.drop(columns=['key'])
)
df:
   c1 c2
0  40  C
1  15  A
2  11  A
3  12  B
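To run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and exposes the intermediate key column before it is dropped. The frame construction and the names with_key and result are additions for illustration only, not part of the answer above.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"c1": [15, 11, 12, 40], "c2": ["A", "A", "B", "C"]})

# Broadcast each group's maximum c1 back onto every row of that group.
with_key = df.assign(key=df.groupby("c2")["c1"].transform("max"))
print(with_key)  # both A rows carry key 15, so they sort as one unit

# Sort by group max first, then by group label (keeps groups with equal
# maxima contiguous), then by c1 inside each group.
result = (
    with_key
    .sort_values(["key", "c2", "c1"], ascending=False, ignore_index=True)
    .drop(columns=["key"])
)
print(result)
```

Including c2 in the sort keys matters only when two categories share the same maximum; without it their rows could interleave.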
Upvotes: 6
Reputation: 14949
IIUC, you can try:
df = (
df.sort_values(by='c1', ascending=False)
.groupby('c2', as_index=False, sort=False)
.agg(list)
.explode('c1')
)
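A self-contained sketch of the same idea on the sample frame from the question. A few things to be aware of: after agg(list) and explode, c1 comes back as object dtype, the index labels repeat, and the columns end up in c2, c1 order. The cast, the ignore_index=True argument, and the column reordering below are optional cleanup steps added here on the assumption that you want the original layout back; out is a hypothetical name.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"c1": [15, 11, 12, 40], "c2": ["A", "A", "B", "C"]})

out = (
    df.sort_values(by="c1", ascending=False)
    # sort=False keeps groups in order of first appearance, which after
    # the descending sort is the order of each group's maximum.
    .groupby("c2", as_index=False, sort=False)
    .agg(list)
    # One row per list element; ignore_index gives a fresh 0..n-1 index.
    .explode("c1", ignore_index=True)
    # explode leaves c1 as object dtype, so cast it back to integers.
    .astype({"c1": "int64"})
    # Restore the original column order.
    .loc[:, ["c1", "c2"]]
)
print(out)
```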
Upvotes: 1