Reputation: 396
Hi,
I am looking set ranking within group, not able to figure out how to do it.
Below is the test data,
df = pd.DataFrame()
names = ['California','New York','California','New York', 'California','New York', 'California', 'California', 'California', 'California']
types = ['Student','Student','Student','Student','Student', 'Pleasure', 'Pleasure', 'Pleasure','Business', 'Business']
df['names'] = names
df['type'] = types
df.groupby(['names', 'type']).size().reset_index(name='counts')
Below is the output
names type counts
0 California Business 2
1 California Pleasure 2
2 California Student 3
3 New York Pleasure 1
4 New York Student 2
I would like to get the below output, Ranking is based on columns names and counts(desc). In California type Business & Pleasure has the same counts, for me it doesn't matter if the result rank is 2,3 or 2,2.
names type counts Rank
0 California Business 2 2
1 California Pleasure 2 2
2 California Student 3 1
3 New York Pleasure 1 2
4 New York Student 2 1
Any ideas/solutions
Thanks
Upvotes: 1
Views: 110
Reputation: 863541
Use GroupBy.rank
with casting floats to integers by Series.astype
:
df['Rank'] = df.groupby('names')['counts'].rank(ascending=False, method='dense').astype(int)
print(df)
names type counts Rank
0 California Business 2 2
1 California Pleasure 2 2
2 California Student 3 1
3 New York Pleasure 1 2
4 New York Student 2 1
Upvotes: 1