Robert T. McGibbon

Reputation: 5277

Interval containing specified percent of values

With numpy or scipy, is there any existing method that will return the endpoints of an interval which contains a specified percent of the values in a 1D array? I realize that this is simple to write myself, but it seems like the kind of thing that might be built in, although I can't find it.

For example, a hypothetical np.bounding_interval would be used like this:

>>> import numpy as np
>>> x = np.random.randn(100000)
>>> print(np.bounding_interval(x, 0.68))

This would give approximately (-1, 1).

Upvotes: 0

Views: 138

Answers (2)

Warren Weckesser

Reputation: 114841

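You can use np.percentile. For a central interval containing a fraction p of the values, take the 50*(1 - p) and 50*(1 + p) percentiles: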

In [29]: x = np.random.randn(100000)

In [30]: p = 0.68

In [31]: lo = 50*(1 - p)

In [32]: hi = 50*(1 + p)

In [33]: np.percentile(x, [lo, hi])
Out[33]: array([-0.99206523,  1.0006089 ])
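
The percentile arithmetic above can be wrapped into a small helper. This is only a minimal sketch: `central_interval` is a hypothetical name, not a NumPy function, and it assumes the interval should be central, with equal fractions excluded from each tail. The fixed seed is there only to make the example repeatable.

```python
import numpy as np


def central_interval(values, fraction):
    """Return (low, high) bounding the central `fraction` of `values`.

    Hypothetical helper for illustration; not part of NumPy.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    lo = 50.0 * (1.0 - fraction)  # lower percentile, about 16 for fraction=0.68
    hi = 50.0 * (1.0 + fraction)  # upper percentile, about 84 for fraction=0.68
    low, high = np.percentile(values, [lo, hi])
    return low, high


rng = np.random.RandomState(0)  # fixed seed for repeatability
x = rng.randn(100000)
low, high = central_interval(x, 0.68)
print(low, high)  # roughly -1 and 1 for standard normal samples
```

Because np.percentile interpolates between neighbouring samples by default, the endpoints need not be actual elements of the array.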

There is also scipy.stats.scoreatpercentile:

In [34]: from scipy.stats import scoreatpercentile

In [35]: scoreatpercentile(x, [lo, hi])
Out[35]: array([-0.99206523,  1.0006089 ])

Upvotes: 3

philE

Reputation: 1703

I don't know of a built-in function to do it, but you can write one using the math module to compute approximate indices into the sorted array, like this:

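from __future__ import division
import math
import numpy as np

def bound_interval(arr_in, interval):
    lhs = (1 - interval) / 2  # Fraction to exclude on the left-hand side
    rhs = 1 - lhs  # Position of the right-hand cut
    sorted_arr = np.sort(arr_in)  # avoid shadowing the built-in sorted()
    last = len(sorted_arr) - 1
    # int() because math.floor returns a float on Python 2;
    # min() keeps the index in range when interval is 1
    lower = sorted_arr[min(int(math.floor(lhs * len(sorted_arr))), last)]
    upper = sorted_arr[min(int(math.floor(rhs * len(sorted_arr))), last)]
    return (lower, upper)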

On your specified array, I got the interval (-0.99072237819851039, 0.98691691784955549). Pretty close to (-1, 1)!
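
One way to sanity-check an interval like this is to measure the fraction of values that actually fall inside it. The sketch below is self-contained; `coverage` is a hypothetical helper name, and the endpoints are picked from the sorted array at the 16% and 84% positions as an assumption about how the interval is built.

```python
import numpy as np


def coverage(values, lower, upper):
    """Fraction of `values` lying in the closed interval [lower, upper]."""
    values = np.asarray(values)
    inside = (values >= lower) & (values <= upper)
    return inside.mean()


rng = np.random.RandomState(0)  # fixed seed for repeatability
x = rng.randn(100000)
ordered = np.sort(x)
n = len(ordered)
lower = ordered[int(0.16 * n)]  # value at the 16% position
upper = ordered[int(0.84 * n)]  # value at the 84% position
print(coverage(x, lower, upper))  # close to 0.68
```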

Upvotes: 0
