Reputation: 5759
I have a set of data that looks like this:
data = [1,1,2,3,3,5,5]
point = 6
I do not know whether this data is normal, or anything else about its distribution. I would like to determine, with some confidence, whether point is significantly higher than the data.
I have tried:
import math
from scipy.stats import t
import numpy as np
confidence = t.interval(0.95, len(data)-1, loc=np.mean(data),
                        scale=np.std(data)/math.sqrt(len(data)))
This returns a lower and an upper bound. However, with my actual data the point I test often falls above the upper bound, so it appears significantly elevated, even though it is well below many of the values in data. Am I using the right test, or is there something better?
Upvotes: 1
Views: 149
Reputation: 7404
Formulae for confidence intervals can be found on Wikipedia. Assuming you know the formula, and assuming the data is normal, you can do something like...
import numpy as np
data = np.array([1,1,2,3,3,5,5])
z_alpha = 1.96 # Two-sided 95% critical value. I use the standard normal, but a t-distribution is safer for small n
n = data.size
m = data.mean()
s = data.std(ddof=1) # Sample standard deviation (ddof=1), not the population one
interval = m + z_alpha*s/np.sqrt(n)*np.array([-1,1])
# Point in interval?
point = 6
print(interval[0] <= point <= interval[1])
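Note that an interval of the form m ± z*s/sqrt(n) describes where the mean is likely to lie, not where individual observations fall. That is why a single point can land outside it while still being smaller than many of the data values. If the goal is to judge one new observation, a prediction interval is probably closer to what is wanted. The following is a minimal sketch, assuming the data are roughly normal (which the question does not confirm). The names upper_bound, is_elevated and frac_at_or_above are illustrative only.

```python
import numpy as np
from scipy import stats

data = np.array([1, 1, 2, 3, 3, 5, 5])
point = 6

n = data.size
mean = data.mean()
s = data.std(ddof=1)  # sample standard deviation

# One-sided 95% upper prediction bound for a single new observation.
# ASSUMPTION: the data are approximately normal.
t_crit = stats.t.ppf(0.95, df=n - 1)
upper_bound = mean + t_crit * s * np.sqrt(1 + 1 / n)

# The point is flagged only if it exceeds the prediction bound.
is_elevated = point > upper_bound

# Distribution-free sanity check: share of observations at or above the point.
frac_at_or_above = np.mean(data >= point)

print(upper_bound, is_elevated, frac_at_or_above)
```

The extra sqrt(1 + 1/n) factor accounts for the spread of a single observation on top of the uncertainty in the mean, so this bound is much wider than the interval for the mean. With only seven observations the empirical fraction is very coarse, so treat it as a rough cross-check rather than a test.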
Upvotes: 1