Alexall

Reputation: 423

Calculate the 95th percentile across multiple lists

I want to calculate the 95th percentile across multiple lists, like this:

import numpy as np
list_a = [0, 5, 10, 2, 3, 5]
list_b = [0, 0, 6, 5, 4, 4]

list_t = []
list_t.extend(list_a)
list_t.extend(list_b)
list_t_pc = np.percentile(list_t, 95)
print('percentile of list total : '+str(list_t_pc))

Output :

percentile of list total : 7.799999999999997

But in my case I collect data day by day and I can't save all the values in a list. I can save only one value per day (such as a mean, a max or a percentile). So, to approximate the overall 95th percentile, I take the percentile of the daily percentiles:

import numpy as np
day_1 = [0, 5, 10, 2, 3, 5]
day_2 = [0, 0, 6, 5, 4, 4]

day_1_pc = np.percentile(day_1, 95)
day_2_pc = np.percentile(day_2, 95)

list_pc = [day_1_pc, day_2_pc]
print('percentile of percentile : '+str(np.percentile(list_pc, 95)))

Output :

percentile of percentile : 8.6

Is there a way to get a value closer to the percentile of the combined list?

Upvotes: 1

Views: 976

Answers (1)

Michael

Reputation: 104

There are several ways to calculate a percentile, and they do not all give the same result. It may help to look at how numpy calculates it: by default it uses linear interpolation between the closest ranks. The method is described at https://en.wikipedia.org/wiki/Percentile#The_linear_interpolation_between_closest_ranks_method

I wrote a simple program to mimic the algorithm used.

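# -*- coding: utf-8 -*-
import math

def percentile(inList, value):
    # Linear interpolation between closest ranks, using a 1-based rank.
    sList = sorted(inList)
    x = len(sList)
    rank = (value / 100.0 * (x - 1)) + 1
    frac, whole = math.modf(rank)
    a = sList[int(whole) - 1]
    # Clamp the upper index so that value=100 does not run past the end.
    b = sList[min(int(whole), x - 1)]
    c = frac * (b - a)
    p = a + c
    return p


list_a = [0, 5, 10, 2, 3, 5]
list_b = [0, 0, 6, 5, 4, 4]

print(percentile(list_a, 95))
print(percentile(list_b, 95))
print(percentile(list_a + list_b, 95))
# Percentile of the two per-list percentiles (8.75 and 5.75).
listc = [8.75, 5.75]
print(percentile(listc, 95))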
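To make the gap in the question concrete, here is a minimal Python 3 sketch of the same interpolation written in 0-based index form. The name `percentile_linear` is only illustrative (it is not a numpy function), and the sketch assumes the linear-interpolation method described above. It uses the two days from the question to compare the percentile of the pooled values with the percentile of the daily percentiles.

```python
import math

def percentile_linear(values, q):
    """Linear interpolation between closest ranks (hypothetical helper)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Fractional 0-based position of the q-th percentile in the sorted data.
    h = (len(ordered) - 1) * q / 100.0
    lo = math.floor(h)
    # Clamp the upper neighbour so q=100 stays inside the list.
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (h - lo) * (ordered[hi] - ordered[lo])

day_1 = [0, 5, 10, 2, 3, 5]
day_2 = [0, 0, 6, 5, 4, 4]

# Percentile over all raw values (needs every value to be kept).
pooled = percentile_linear(day_1 + day_2, 95)

# Percentile of the per-day percentiles (only one value kept per day).
daily = [percentile_linear(day_1, 95), percentile_linear(day_2, 95)]
of_daily = percentile_linear(daily, 95)

print("pooled:", pooled)
print("daily:", daily)
print("percentile of daily percentiles:", of_daily)
```

The pooled result is about 7.8 and the percentile of the daily percentiles is about 8.6, matching the two outputs in the question. The second number interpolates between only two points (5.75 and 8.75), so it mostly reflects the larger daily value rather than the distribution of the raw data.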

Upvotes: 1
