Reputation: 59
I have a dataset of 10 values that I would like to assign to bins according to their percentile.
My data:
stock value
s_001 -0.001932
s_002 0.004001
s_003 0.001323
s_004 -0.006785
s_005 0.004405
s_006 -0.002872
s_007 0.003101
s_008 0.001383
s_009 -0.004785
s_010 0.001405
Percentiles:
breakpoints = [0, 20, 40, 60, 80]
I used df.sort_values to sort the values:
stock value
s_001 -0.001932
s_006 -0.002872
s_009 -0.004785
s_004 -0.006785
s_003 0.001323
s_008 0.001383
s_010 0.001405
s_007 0.003101
s_002 0.004001
s_005 0.004405
After sorting, how can I assign the first two values to the first percentile, then the next two to the second percentile and so on?
Upvotes: 0
Views: 46
Reputation: 260360
You can use pandas.qcut. You will need the breakpoints as numbers between 0 and 1:
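import pandas as pd

breakpoints = [0.0, 0.2, 0.4, 0.6, 0.8]
# qcut also needs the closing edge (1); each bin is labelled by its lower percentile
df['quantile'] = pd.qcut(df['value'],
                         breakpoints + [1],
                         labels=[int(i*100) for i in breakpoints])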
NB. the dataframe does not need to be sorted for this.
output:
stock value quantile
0 s_001 -0.001932 20
1 s_002 0.004001 80
2 s_003 0.001323 40
3 s_004 -0.006785 0
4 s_005 0.004405 80
5 s_006 -0.002872 20
6 s_007 0.003101 60
7 s_008 0.001383 40
8 s_009 -0.004785 0
9 s_010 0.001405 60
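For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the question's data, applies the same qcut idea, and adds a positional alternative that matches the "first two values, then the next two" wording more literally. The column name by_position and the helper variables edges, rank and per_bin are made up for illustration, and the positional variant assumes equal-sized bins with a row count divisible by the number of bins.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    "stock": [f"s_{i:03d}" for i in range(1, 11)],
    "value": [-0.001932, 0.004001, 0.001323, -0.006785, 0.004405,
              -0.002872, 0.003101, 0.001383, -0.004785, 0.001405],
})

breakpoints = [0, 20, 40, 60, 80]  # lower edge of each bin, in percent

# Quantile-based bins: convert percents to fractions and add the closing edge 1
edges = [b / 100 for b in breakpoints] + [1]
df["quantile"] = pd.qcut(df["value"], edges, labels=breakpoints)

# Positional alternative (assumption: equal-sized bins, len(df) divisible
# by the number of bins). rank(method="first") gives each row a unique
# 1-based position in ascending order, so no explicit sort is needed.
rank = df["value"].rank(method="first").astype(int) - 1  # 0-based position
per_bin = len(df) // len(breakpoints)                    # 2 rows per bin here
df["by_position"] = (rank // per_bin).map(lambda i: breakpoints[i])

print(df)
```

On this data both columns should agree. They can differ when there are tied values or when the row count does not divide evenly, since qcut bins by value while the positional version bins by row count.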
Upvotes: 3