The Nightman

Reputation: 5759

Calculating confidence levels of a distribution

I have a set of data that looks like this:

data = [1,1,2,3,3,5,5]
point = 6

I do not know whether this data is normal, or anything else about its distribution. I would like to determine, at some confidence level, whether point is significantly higher than the data.

I have tried:

import math
from scipy.stats import t
import numpy as np
confidence = (
    t.interval(0.95,len(data)-1,loc=np.mean(data),
    scale=np.std(data)/math.sqrt(len(data))))

This returns a lower and upper bound, but with my actual data the point is often flagged as significantly elevated even though it is well below many of the values in data. Am I using the right test, or is there something better?

Upvotes: 1

Views: 149

Answers (1)

Demetri Pananos

Reputation: 7404

Formulae for confidence intervals can be found on Wikipedia. Assuming you know the formula, and assuming the data is normal, you can do something like...

import numpy as np

data = np.array([1,1,2,3,3,5,5])
z_alpha = 1.96 #Change for appropriate alpha level.  I use standard normal, but you could use t-dist
n = data.size
m = data.mean()
s = data.std(ddof=1)  # sample standard deviation (NumPy defaults to ddof=0)
interval = m + z_alpha*s/np.sqrt(n)*np.array([-1,1])

# Point in interval?

point = 6

(point<=interval[1])&(point>=interval[0]) 
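
If you would rather use the t-distribution mentioned in the comment above, here is a minimal sketch of the same check. It still assumes the data is roughly normal. The names `sem`, `t_crit` and `point_in_interval` are mine, and the 0.975 quantile corresponds to a two-sided 95% level, so change it for a different alpha.

```python
import numpy as np
from scipy import stats

data = np.array([1, 1, 2, 3, 3, 5, 5])
point = 6

n = data.size
m = data.mean()
s = data.std(ddof=1)      # sample standard deviation
sem = s / np.sqrt(n)      # standard error of the mean

# Two-sided 95% critical value from Student's t with n - 1 degrees of freedom
t_crit = stats.t.ppf(0.975, df=n - 1)
interval = m + t_crit * sem * np.array([-1, 1])

# Point in interval?
point_in_interval = bool(interval[0] <= point <= interval[1])
print(interval, point_in_interval)
```

Keep in mind that this is an interval for the mean of the data, not a range that individual observations are expected to fall in. That is why a point can land outside it while still being smaller than many of the values in data.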

Upvotes: 1
