Reputation: 101
I have the list below.
33, 26, 24, 21, 19, 20, 18, 18, 52, 56, 27, 22, 18, 49, 22, 20, 23, 32, 20, 18
All I am trying to do is find the 25th percentile.
I used a simple NumPy program to find it.
import numpy as np
arr = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56, 27, 22, 18, 49, 22, 20, 23, 32, 20, 18]
np.percentile(arr,25)
Output is: 19.75
But if we calculate it manually or use Excel, the 25th percentile comes out as 19.25.
I expect the output to be 19.25, but the actual output from NumPy is 19.75. Can someone please explain what is wrong here?
Upvotes: 1
Views: 2852
Reputation: 545
You see, Excel has two percentile functions: PERCENTILE.EXC and PERCENTILE.INC. The difference is that "in the Percentile.Inc function the value of k is within the range 0 to 1 inclusive, and in the Percentile.Exc function, the value of k is within the range 0 to 1 exclusive." (source)
NumPy's percentile function computes the k-th percentile, where k must be between 0 and 100 inclusive (docs).
Let's check that.
import numpy as np
# The question's data, sorted
arr = [18, 18, 18, 18, 19, 20, 20, 20, 21, 22, 22, 23, 24, 26, 27, 32, 33, 49, 52, 56]
np.percentile(arr, 25)
19.75
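To make the inclusive/exclusive difference concrete, here is a minimal sketch of both rules in plain Python. It assumes that the inclusive rule (PERCENTILE.INC, and NumPy's default) interpolates at the 1-based rank (n - 1) * p + 1, and that the exclusive rule (PERCENTILE.EXC) interpolates at rank (n + 1) * p. The helper name percentile_by_rank is hypothetical and exists only for this illustration.

```python
def percentile_by_rank(values, p, exclusive=False):
    """Linearly interpolated percentile for a fraction p (hypothetical helper)."""
    data = sorted(values)
    n = len(data)
    # 1-based fractional rank under each convention (assumed formulas):
    #   exclusive: (n + 1) * p
    #   inclusive: (n - 1) * p + 1
    rank = (n + 1) * p if exclusive else (n - 1) * p + 1
    if rank < 1 or rank > n:
        raise ValueError("p is out of range for this sample size")
    lower = int(rank)      # whole part of the rank
    frac = rank - lower    # fractional part used for interpolation
    if lower == n:         # rank points exactly at the last element
        return data[-1]
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])


arr = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
       27, 22, 18, 49, 22, 20, 23, 32, 20, 18]

inclusive_result = percentile_by_rank(arr, 0.25)                  # rank 5.75
exclusive_result = percentile_by_rank(arr, 0.25, exclusive=True)  # rank 5.25
print(inclusive_result, exclusive_result)  # 19.75 19.25
```

With 20 values, the inclusive rank is 5.75 and the exclusive rank is 5.25. Both fall between the 5th and 6th sorted values (19 and 20), which is why the two results differ by exactly 0.5. Under these assumptions, the 19.25 in the question matches the exclusive rule rather than a NumPy error.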
Hope that helps
Upvotes: 2
Reputation: 2332
Check your input values, and look up which method Excel uses, since these are the options in NumPy:
import numpy as np
# Interpolation options to compare
t = ['linear', 'lower', 'higher', 'nearest', 'midpoint']
arr = np.array([33, 26, 24, 21, 19, 20, 18, 18, 52, 56, 27, 22, 18, 49, 22, 20, 23, 32, 20, 18])
for i in t:
    # NumPy 1.22+ names this keyword `method`; older releases call it `interpolation`
    v = np.percentile(arr, 25., method=i)
    print("type: {} value: {}".format(i, v))
type: linear value: 19.75
type: lower value: 19
type: higher value: 20
type: nearest value: 20
type: midpoint value: 19.5
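None of the five options listed above yields 19.25. As a hedged sketch, assuming NumPy 1.22 or later (where np.percentile accepts a method keyword) and assuming the 19.25 came from Excel's PERCENTILE.EXC, the "weibull" method uses the same (n + 1) * p rank convention and should reproduce that value:

```python
import numpy as np

arr = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
       27, 22, 18, 49, 22, 20, 23, 32, 20, 18]

# Default method ("linear"): the inclusive convention
inclusive = float(np.percentile(arr, 25))

# "weibull" method (requires NumPy >= 1.22): the exclusive convention
exclusive = float(np.percentile(arr, 25, method="weibull"))

print(inclusive, exclusive)  # 19.75 19.25
```

On older NumPy versions the method keyword is not available, so this call would raise a TypeError there.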
Upvotes: 1