Reputation: 962
From a dataframe df
I want to update the value of a column Points
for the top 3 values of another column Time
after sorting the Time
column in ascending order, such that
df['Points'] = df['Points'] * 1.3
for the first row (smallest Time
)
df['Points'] = df['Points'] * 1.2
for the second row (second smallest Time
)
df['Points'] = df['Points'] * 1.1
for the third row (third smallest Time
) rounded to the nearest integer.
and Points
for all other rows remains the same.
I have to do this for every unique value for a third column value Challenge
. How can I do this?
So, I need PointsA
instead of Points
from below -
Challenge Team Time Points PointsA
A 1 2019-11-05 23:00:43.07589 200 260
B 3 2019-11-05 22:10:55.07589 100 130
A 5 2019-11-05 23:05:43.07589 200 240
A 7 2019-11-05 23:07:33.07589 200 220
B 10 2019-11-05 22:20:13.07589 100 120
C 4 2019-11-06 00:05:22.07589 50 65
A 4 2019-11-05 23:18:23.07589 200 200
I've tried something like -
for challenge in df['Challenge'].unique():
df[df['Challenge'] == challenge].sort_values('Time', ascending=True).head(1)['Points'] *= 1.3
but that doesn't seem to work.
Upvotes: 2
Views: 298
Reputation: 25239
Try this. Use value_counts
and items
to get each challenge
and length of them. Use these length to narrow on assignment of challenge
val = [1.3, 1.2, 1.1]
df.Time = pd.to_datetime(df.Time)
for challenge, i in df['Challenge'].value_counts().items():
df.loc[df[df['Challenge'] == challenge].nsmallest(3, 'Time').index, 'Points'] *= val[:i]
Out[201]:
Challenge Team Time Points PointsA
0 A 1 2019-11-05 23:00:43.075890 260.0 260
1 B 3 2019-11-05 22:10:55.075890 130.0 130
2 A 5 2019-11-05 23:05:43.075890 240.0 240
3 A 7 2019-11-05 23:07:33.075890 220.0 220
4 B 10 2019-11-05 22:20:13.075890 120.0 120
5 C 4 2019-11-06 00:05:22.075890 65.0 65
6 A 4 2019-11-05 23:18:23.075890 200.0 200
As Challenge = 'C'
has one row and it got calculated correctly from 50
to 65
Upvotes: 1
Reputation: 4284
Here is a way to do it
import pandas as pd
import numpy as np
# compute rank by challenge
df['rank_in_challenge'] = df.groupby('Challenge')['Time'].rank(method='first',ascending=True).astype('int')
# apply change in points
conditions = [ df['rank_in_challenge']==1,df['rank_in_challenge']==2,df['rank_in_challenge']==3]
choices = [ 1.3, 1.2, 1.1 ]
df["PointsA"] = np.select(conditions, choices, default=1.0)*df['Points']
Upvotes: 1