bobby.dreamer
bobby.dreamer

Reputation: 396

Group multiple columns and rank within the group

Hi,

I am looking set ranking within group, not able to figure out how to do it.

Below is the test data, 
df = pd.DataFrame()

names = ['California','New York','California','New York', 'California','New York', 'California', 'California', 'California', 'California']
types = ['Student','Student','Student','Student','Student', 'Pleasure', 'Pleasure', 'Pleasure','Business', 'Business']

df['names'] = names
df['type'] = types

df.groupby(['names', 'type']).size().reset_index(name='counts')

Below is the output
    names       type        counts
0   California  Business    2
1   California  Pleasure    2
2   California  Student     3
3   New York    Pleasure    1
4   New York    Student     2

I would like to get the below output, Ranking is based on columns names and counts(desc). In California type Business & Pleasure has the same counts, for me it doesn't matter if the result rank is 2,3 or 2,2.

        names   type      counts Rank
0   California  Business  2      2      
1   California  Pleasure  2      2
2   California  Student   3      1
3   New York    Pleasure  1      2
4   New York    Student   2      1

Any ideas/solutions

Thanks

Upvotes: 1

Views: 110

Answers (1)

jezrael
jezrael

Reputation: 863541

Use GroupBy.rank with casting floats to integers by Series.astype:

df['Rank'] = df.groupby('names')['counts'].rank(ascending=False, method='dense').astype(int)
print(df)
        names      type  counts  Rank
0  California  Business       2     2
1  California  Pleasure       2     2
2  California   Student       3     1
3    New York  Pleasure       1     2
4    New York   Student       2     1

Upvotes: 1

Related Questions