eugene

Reputation: 41665

How to calculate percentile

Given a list of numbers such as

[1, 3, 5, 7, 7, 9, 11, 11, 11, 24]

I want the corresponding list of percentiles, where tied values share the percentile of their first occurrence:

[10%, 20%, 30%, 40%, 40%, 60%, 70%, 70%, 70%, 100%]

In plain Python:

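l = [1, 3, 5, 7, 7, 9, 11, 11, 11, 24]  # assumed to be sorted
percentiles = []

prev_value = None
prev_index = None
for index, value in enumerate(l):
    index_to_use = index + 1  # 1-based rank of this element
    if prev_value == value:
        # Tied values reuse the rank of their first occurrence
        index_to_use = prev_index

    percentile = index_to_use / len(l) * 100
    percentiles.append(percentile)

    if value != prev_value:
        prev_value = value
        prev_index = index_to_use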

Is there a simpler way to do this with NumPy?

Upvotes: 1

Views: 698

Answers (3)

piRSquared

Reputation: 294218

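Cute NumPy Trick

import numpy as np
mylist = [1, 3, 5, 7, 7, 9, 11, 11, 11, 24]  # must already be sorted
# index: first position of each unique value; inverse: maps each element to its unique value
unique_values, index, inverse = np.unique(mylist, return_index=True, return_inverse=True)
(index[inverse] + 1) / len(inverse) * 100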

array([ 10.,  20.,  30.,  40.,  40.,  60.,  70.,  70.,  70., 100.])
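The trick above uses the position of each value's first occurrence as its rank, so it relies on the list being sorted. For unsorted input, one possible variant is sketched below. The helper name `min_rank_percentiles` is made up for illustration, and `np.searchsorted` is swapped in for `np.unique`; treat it as a sketch rather than the answer's own method.

```python
import numpy as np

def min_rank_percentiles(values):
    """Percentile of each value from its lowest 1-based rank in sorted order.

    Hypothetical helper: tied values share the rank of their first
    occurrence in the sorted data, as in the question's example.
    """
    arr = np.asarray(values)
    sorted_arr = np.sort(arr)
    # side="left" gives the index of the first element equal to each value
    ranks = np.searchsorted(sorted_arr, arr, side="left") + 1
    return ranks / arr.size * 100

print(min_rank_percentiles([1, 3, 5, 7, 7, 9, 11, 11, 11, 24]))
print(min_rank_percentiles([11, 1, 7, 24, 7]))  # unsorted input
```

For the sorted example this gives the same values as the `np.unique` version; for the unsorted list the percentiles come back in the original element order.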

Upvotes: 2

Mayank Porwal

Reputation: 34046

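You can also rescale the values linearly between the minimum and the maximum. Note that this is min-max scaling, so the result differs from the rank-based percentiles in the question: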

In [501]: arr = np.array([1, 3, 5, 7, 7, 9, 11, 11, 11, 24])
In [504]: l = (arr - 1) / (np.max(arr) - 1) * 100

In [505]: l
Out[505]: 
array([  0.        ,   8.69565217,  17.39130435,  26.08695652,
        26.08695652,  34.7826087 ,  43.47826087,  43.47826087,
        43.47826087, 100.        ])
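The expression above hardcodes the minimum as 1, which only works because the example data starts at 1. A minimal sketch that reads the minimum from the data instead might look like this (the names `lo`, `hi` and `scaled` are just for illustration, and it assumes the values are not all equal):

```python
import numpy as np

arr = np.array([1, 3, 5, 7, 7, 9, 11, 11, 11, 24])

lo, hi = arr.min(), arr.max()
# Min-max scaling to the range 0..100; assumes hi > lo,
# otherwise the division below would be by zero.
scaled = (arr - lo) / (hi - lo) * 100
print(scaled)
```

For this data it reproduces the output shown above, since the minimum happens to be 1.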

Upvotes: 1

abysslover

Reputation: 788

You could use np.percentile as follows:

import numpy as np
if __name__ == '__main__':
    data = [1, 3, 5, 7, 7, 9, 11, 11, 11, 24]
    percentiles = np.percentile(data, np.arange(10, 110, 10))
    print(percentiles)

Result:

[ 2.8  4.6  6.4  7.   8.   9.8 11.  11.  12.3 24. ]
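Note that np.percentile goes from percentiles to values (the value below which a given share of the data falls), which is the reverse of what the question asks for. Values such as 2.8 come from the default linear interpolation between neighbouring sorted elements. The sketch below reproduces that by hand; the variable names are illustrative only.

```python
import numpy as np

data = np.sort(np.array([1, 3, 5, 7, 7, 9, 11, 11, 11, 24], dtype=float))
q = np.arange(10, 110, 10)

# Fractional 0-based position of each percentile in the sorted data
pos = q / 100 * (data.size - 1)
lower = np.floor(pos).astype(int)
upper = np.minimum(lower + 1, data.size - 1)  # clamp at the last element
frac = pos - lower

# Interpolate linearly between the two neighbouring sorted values
manual = data[lower] + frac * (data[upper] - data[lower])
print(manual)
```

For example, the 10th percentile sits at position 0.9, between 1 and 3, giving 1 + 0.9 * 2 = 2.8.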

Upvotes: 1
