Calculating confidence levels of a distribution

Question

I have a set of data that looks like this:

data = [1,1,2,3,3,5,5]
point = 6

I do not know if this data is normal or anything else about its distribution. I would like to calculate with some confidence that point is significantly higher than the data.

I have tried:

import math
from scipy.stats import t
import numpy as np
confidence = (
    t.interval(0.95,len(data)-1,loc=np.mean(data),
    scale=np.std(data)/math.sqrt(len(data))))

This returns a lower and upper bound, but with my actual data the point often appears to be significantly elevated even though it is well below many of the values in data. Am I using the right test, or is there something better?

Demetri Pananos · Accepted Answer

Formulae for confidence intervals can be found on Wikipedia. Assuming you know the formula, and assuming the data is normal, you can do something like this:

import numpy as np

data = np.array([1,1,2,3,3,5,5])
z_alpha = 1.96  # Critical value for a 95% interval; change for other alpha levels. I use the standard normal, but you could use the t-distribution
n = data.size
m = data.mean()
s = data.std(ddof=1)  # sample standard deviation (ddof=1), the usual choice when estimating from a sample
interval = m + z_alpha*s/np.sqrt(n)*np.array([-1,1])

# Point in interval?

point = 6

(point<=interval[1])&(point>=interval[0])
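
The comment above mentions that a t-distribution could be used instead of the standard normal. A minimal sketch of that variant follows, assuming SciPy is available and assuming the sample standard deviation (ddof=1) is wanted; the names sem, lower, upper and point_in_interval are only illustrative.

```python
import numpy as np
from scipy.stats import t

data = np.array([1, 1, 2, 3, 3, 5, 5])
point = 6

n = data.size
m = data.mean()
s = data.std(ddof=1)      # sample standard deviation (assumption: ddof=1)
sem = s / np.sqrt(n)      # standard error of the mean

# 95% two-sided interval for the mean, t-distribution with n-1 degrees of freedom
lower, upper = t.interval(0.95, n - 1, loc=m, scale=sem)

# Point in interval?
point_in_interval = bool(lower <= point <= upper)
print(lower, upper, point_in_interval)
```

With only 7 observations the t critical value is larger than 1.96, so this interval is somewhat wider than the normal-based one. Either way, the interval describes the mean of the data, so a point can fall outside it while still lying below individual values in data.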
