Reputation: 616
I have the following dataframe:
ID Name Weight Score
1 Amazon 2 11
1 Apple 4 10
1 Netflix 1 10
2 Amazon 2 8
2 Apple 4 8
2 Netflix 1 5
Currently I have a code which looks like this
#add weight and score column
df['Rank'] = df['Weight'] + df['Score']
#create score rank on ID column
df['Score_Rank'] = df.groupby('ID')['Rank'].rank("first", ascending = False)
This code does not give me exactly what I want.
I would like to first rank on Score, without including the weight. And then break any ties in the rank by adding weight column to break them. If there are further ties after weight column has been added, then rank would be by random selection.
I think an if statement could work in this scenario, just not sure how.
Expected output:
ID Name Weight Score Score_Rank
1 Amazon 2 11 1
1 Apple 4 10 2
1 Netflix 1 10 3
2 Amazon 2 8 2
2 Apple 4 8 1
2 Netflix 1 5 3
Upvotes: 3
Views: 2756
Reputation: 153460
You can use rank
with method='first'
with some presorting first:
df['Score_Rank'] = (df.sort_values('Weight', ascending=False)
.groupby(['ID'])['Score']
.rank(method='first', ascending=False)
)
Output:
ID Name Weight Score Score_Rank
0 1 Amazon 2 11 1.0
1 1 Apple 4 10 2.0
2 1 Netflix 1 10 3.0
3 2 Amazon 2 8 2.0
4 2 Apple 4 8 1.0
5 2 Netflix 1 5 3.0
Details:
First, sort your dataframe by weights descending, then use rank with method first on Score which will break ties based on sort order of the dataframe. And because, pandas does intrinsic data alignment, assign to new column 'Score_Rank' yeilds the based on original order of the dataframe.
Upvotes: 6
Reputation: 150745
Try with cumcount
:
df['Score_Rank'] = (df.sort_values(['Score','Weight'])
.groupby(['ID']).cumcount(ascending=False)+1
)
Output:
ID Name Weight Score Score_Rank
0 1 Amazon 2 11 1
1 1 Apple 4 10 2
2 1 Netflix 1 10 3
3 2 Amazon 2 8 2
4 2 Apple 4 8 1
5 2 Netflix 1 5 3
Upvotes: 4