Reputation: 59
For example i have df with such columns :
Area Prod Count
A A1 5
A A2 4
B B1 5
B B2 11
B B3 15
C C1 5
I want to apply sort based on occurrences, then sort by Count
For instance after sorting the df should look like this:
Area Prod Count
B B3 15
B B2 11
B B1 1
A A1 5
A A2 4
C C1 5
Since B
have 3 rows it is sort first, then inside B
the Prod
is sort by Count
I tried using df.groupby('area', 'prod').sort_values(by=('Count'), ascending=False))
didn't able to solve it. Thanks
Upvotes: 0
Views: 48
Reputation: 862541
One idea is create index by counts and then use it for sorting:
df.index = df['Area'].map(df['Area'].value_counts()).rename('idx')
print (df.sort_values(['idx','Count'], ascending=False))
Area Prod Count
idx
3 B B3 15
3 B B2 11
3 B B1 5
2 A A1 5
2 A A2 4
1 C C1 5
df = df.sort_values(['idx','Count'], ascending=False).reset_index(drop=True)
print (df)
Area Prod Count
0 B B3 15
1 B B2 11
2 B B1 5
3 A A1 5
4 A A2 4
5 C C1 5
But problem should be if there is multiple same counts, added D
group with length 3
:
print (df)
Area Prod Count
0 A A1 5
1 A A2 4
2 B B1 5
3 B B2 11
4 B B3 15
5 C C1 5
6 D A1 50
7 D A2 40
8 D B1 5
Then create index by counts:
df.index = df['Area'].map(df['Area'].value_counts()).rename('idx')
print (df)
Area Prod Count
idx
2 A A1 5
2 A A2 4
3 B B1 5
3 B B2 11
3 B B3 15
1 C C1 5
3 D A1 50
3 D A2 40
3 D B1 5
But if sorting like first solution are mixed rows:
print (df.sort_values(['idx','Count'], ascending=False))
Area Prod Count
idx
3 D A1 50
3 D A2 40
3 B B3 15
3 B B2 11
3 B B1 5
3 D B1 5
2 A A1 5
2 A A2 4
1 C C1 5
Solution is add column Area
for sorting, e.g. all descending:
print (df.sort_values(['idx','Area','Count'], ascending=False))
Area Prod Count
idx
3 D A1 50
3 D A2 40
3 D B1 5
3 B B3 15
3 B B2 11
3 B B1 5
2 A A1 5
2 A A2 4
1 C C1 5
Or you can sort Area
by ascending:
print (df.sort_values(['idx','Area','Count'], ascending=[False, True, False]))
Area Prod Count
idx
3 B B3 15
3 B B2 11
3 B B1 5
3 D A1 50
3 D A2 40
3 D B1 5
2 A A1 5
2 A A2 4
1 C C1 5
Upvotes: 1