lighting
lighting

Reputation: 400

confidence interval based on sequentional sampling

When I want 100 sample sequentially from N(1,2) and estimate the means and std sequentially I can do it like:

sample = np.random.normal(1,2,(100, 1))
sample_mean = []
for i,_ in enumerate(sample):
    sample_mean.append(sample[:i+1,:].ravel().mean())
    sample_std.append(sample[:i+1,:].ravel().std())

But if I want to compute the confidence intervals for these sequential samples how can I do that?

Upvotes: 1

Views: 144

Answers (2)

KM_83
KM_83

Reputation: 727

One can calculate the CI of a sample mean as sampleMean +/- tstat * std/sqrt(n)

and CI of a sample standard deviation as a square root of ( (sampleSize-1)*std^2/chisq(a/2), (sampleSize-1)*std^2/chisq(1-a/2) ).

Here is an sample:

import numpy as np
import scipy.stats as stats
from scipy.stats import chi2

n_series = 100
sample = np.random.normal(0,1,n_series)
sample_mean, sample_std = [], []
sample_mean_CI, sample_std_CI = [], []
alpha = 0.1 # e.g. alpha = 0.1 for 90-percent CI, alpha=0.05 for 95-percent CI

def mean_div(std, n, alpha): return stats.t.isf(alpha/2, n) * (std/np.sqrt(n))
def mean_ci(xbar, std, n, alpha): 
    div = mean_div(std, n, alpha)
    return (xbar - div, xbar + div)

def std_lower(std, n, alpha): return np.sqrt(((n-1)*std**2)/chi2.isf(alpha/2, n-1))
def std_upper(std, n, alpha): return np.sqrt(((n-1)*std**2)/chi2.isf(1-alpha/2, n-1))
def std_ci(std, n, alpha): return (std_lower(std, n, alpha), std_upper(std, n, alpha))

for i,_ in enumerate(sample):
    x = sample[:i+1]
    xbar = x.mean()
    std = x.std()
    sample_mean.append(xbar)
    sample_std.append(std)
    sample_mean_CI.append(mean_ci(xbar, std, i+1, alpha))
    sample_std_CI.append(std_ci(std, i+1, alpha))
    
# check for convergence for the whole series
print(sample_mean_CI[-1])
print(sample_std_CI[-1])

One can change the number of data series and/or the mean and std parameters of normal distribution to experiment with it.

Upvotes: 1

josephmure
josephmure

Reputation: 228

You can do it with OpenTURNS. The following code computes asymptotic bilateral confidence intervals on the parameters (mean and standard deviation) with confidence level 0.9.

import numpy as np
import openturns as ot
sample = np.random.normal(1,2,(100, 1))
confidence_level = 0.9
sample_mean_ci = []
sample_std_ci = []
nor = ot.NormalFactory() # class that fits a Normal distribution on the data
for i in range(1, len(sample)):
    estimated_params = nor.buildEstimator(sample[:i+1,:]).getParameterDistribution()
    ci = estimated_params.computeBilateralConfidenceInterval(confidence_level)
    sample_mean_ci.append([ci.getLowerBound()[0], ci.getUpperBound()[0]])
    sample_std_ci.append([ci.getLowerBound()[1], ci.getUpperBound()[1]])

Upvotes: 0

Related Questions