Reputation: 519
I have the following dataframe:
User_ID  Game_ID  votes
1        11       1040
1        11       nan
1        22       1101
1        11       540
1        33       nan
2        33       nan
2        33       290
2        33       nan
Based on the percentile of the values in the votes column, a new column needs to be created according to the following rules:
If the “votes” value is >= the 75th percentile, assign a score of 2.
If it is >= the 25th percentile (but below the 75th), assign a score of 1.
If it is < the 25th percentile, assign a score of 0.
Upvotes: 3
Views: 537
Reputation: 434
You can get the percentiles by calling describe and then use a list comprehension. Note that nan fails both comparisons, so rows with missing votes end up with a score of 1:
percentiles = df.votes.describe()
df['scores'] = [2 if x >= percentiles['75%'] else (0 if x < percentiles['25%'] else 1) for x in df.votes]
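If the rows with missing votes should stay missing instead of being scored, a vectorized variant along these lines might work. This is only a sketch: the dataframe is rebuilt from the sample in the question, and the names q25, q75 and conditions are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "User_ID": [1, 1, 1, 1, 1, 2, 2, 2],
    "Game_ID": [11, 11, 22, 11, 33, 33, 33, 33],
    "votes": [1040, np.nan, 1101, 540, np.nan, np.nan, 290, np.nan],
})

# quantile() skips nan by default, like describe() does.
q25, q75 = df["votes"].quantile([0.25, 0.75])

# np.select takes the first condition that matches. A nan fails all
# three comparisons, so those rows fall through to the default.
conditions = [df["votes"] >= q75, df["votes"] >= q25, df["votes"] < q25]
df["scores"] = np.select(conditions, [2.0, 1.0, 0.0], default=np.nan)
print(df)
```

The scores column is float here because nan cannot be stored in a plain integer column.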
Upvotes: 2
Reputation: 19885
Use pd.qcut:
df['score'] = pd.qcut(df['votes'].astype(float), [0, 0.25, 0.75, 1.0]).cat.codes
print(df['score'])
Output (nan corresponds to -1):
0 1
1 -1
2 2
3 1
4 -1
5 -1
6 0
7 -1
dtype: int8
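If -1 is awkward as a marker for missing values, one possible alternative is to ask qcut for the bin index directly with labels=False, which leaves nan in place. A rough sketch, assuming the same votes values as in the question (the names votes, binned and score are made up for this example):

```python
import numpy as np
import pandas as pd

# votes column from the question
votes = pd.Series([1040, np.nan, 1101, 540, np.nan, np.nan, 290, np.nan])

# labels=False returns the integer bin index instead of a categorical;
# missing inputs stay nan, so the result is a float Series.
binned = pd.qcut(votes, [0, 0.25, 0.75, 1.0], labels=False)

# Optional: the nullable integer dtype keeps whole numbers and shows <NA>.
score = binned.astype("Int64")
print(score)
```

One thing to keep in mind with either form: qcut bins are closed on the right, so a value exactly equal to the 25th or 75th percentile lands in the lower bin. That differs slightly from the >= rules in the question.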
Upvotes: 2