Reputation: 423
I want to calculate the 95th percentile across multiple lists, like this:
import numpy as np
list_a = [0, 5, 10, 2, 3, 5]
list_b = [0, 0, 6, 5, 4, 4]
list_t = []
list_t.extend(list_a)
list_t.extend(list_b)
list_t_pc = np.percentile(list_t, 95)
print('percentile of list total : '+str(list_t_pc))
Output:
percentile of list total : 7.799999999999997
But in my case I collect data day by day and I can't store all the values in a list. I can store only one value per day (such as a mean, a max, or a percentile). So to approximate the 95th percentile, I do:
import numpy as np
day_1 = [0, 5, 10, 2, 3, 5]
day_2 = [0, 0, 6, 5, 4, 4]
day_1_pc = np.percentile(day_1, 95)
day_2_pc = np.percentile(day_2, 95)
list_pc = [day_1_pc, day_2_pc]
print('percentile of percentile : '+str(np.percentile(list_pc, 95)))
Output:
percentile of percentile : 8.6
Is there a way to calculate a closer value?
Upvotes: 1
Views: 976
Reputation: 104
There are different ways to calculate a percentile. It may help to look at how NumPy calculates it: by default it uses linear interpolation between the closest ranks. The method is described here: https://en.wikipedia.org/wiki/Percentile#The_linear_interpolation_between_closest_ranks_method
I wrote a simple program to mimic the algorithm used.
import math

def percentile(inList, value):
    # Linear interpolation between closest ranks (rank is 1-based).
    sList = sorted(inList)
    x = len(sList)
    rank = (value / 100.0 * (x - 1)) + 1
    frac, whole = math.modf(rank)
    a = sList[int(whole) - 1]
    # Clamp the upper index so that value=100 does not run past the end.
    b = sList[min(int(whole), x - 1)]
    return a + frac * (b - a)

list_a = [0, 5, 10, 2, 3, 5]
list_b = [0, 0, 6, 5, 4, 4]
print(percentile(list_a, 95))
print(percentile(list_b, 95))
print(percentile(list_a + list_b, 95))
# The two per-list 95th percentiles computed above
listc = [8.75, 5.75]
print(percentile(listc, 95))
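To make the gap between the two approaches easier to see, here is a minimal sketch that puts the pooled percentile and the percentile of the daily percentiles side by side. It uses the same interpolation idea in 0-based index form; the helper name `interp_percentile` and the variable names are my own, not part of NumPy.

```python
def interp_percentile(values, pct):
    """Linear interpolation between closest ranks (0-based index form).

    Hypothetical helper for illustration; intended to mirror the
    interpolation described above, not taken from any library.
    """
    ordered = sorted(values)
    pos = pct / 100.0 * (len(ordered) - 1)      # fractional index
    lower = int(pos)                            # index just below pos
    upper = min(lower + 1, len(ordered) - 1)    # clamp at the last item
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

day_1 = [0, 5, 10, 2, 3, 5]
day_2 = [0, 0, 6, 5, 4, 4]

# One stored value per day, as in the question.
daily = [interp_percentile(day_1, 95), interp_percentile(day_2, 95)]

# What we would get if every raw value could be kept.
pooled = interp_percentile(day_1 + day_2, 95)

# What we get from the stored daily values only.
of_daily = interp_percentile(daily, 95)

print("daily percentiles      :", daily)
print("pooled percentile      :", pooled)
print("percentile of the above:", of_daily)
print("difference             :", of_daily - pooled)
```

With these two days the percentile of the daily percentiles comes out near 8.6 while the pooled value is near 7.8, which matches the numbers in the question.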
Upvotes: 1