Reputation: 11
I don't understand how summarySE() in the Rmisc package calculates the confidence intervals (ci) of my data. The values do not seem to be correct.
For example, after running summarySE(data = df, measurevar = "numbers", groupvars = "conditions", conf.interval = 0.95), the output shows:
     conditions  N numbers        sd        se        ci
1 constructionA 10   6.025 0.3987829 0.1261062 0.2852721
2 constructionB 10   1.925 0.3545341 0.1121135 0.2536184
However, the confidence interval of constructionA should be 6.025 ± 1.96 × (0.3987829)/√10, which should be 6.025 ± 0.24716366. I don't understand where the value of 0.2852721 comes from after applying summarySE(). Shouldn't it be 0.24716366?
Could anyone tell me what's wrong here?
Thank you!
Upvotes: 1
Views: 389
Reputation: 1233
A common construction of a confidence interval is
(statistic) +/- c*(standard error of statistic)
where c is the critical value. c=1.96 is (approximately) the critical value you get for a normally-distributed z-statistic and a 95% confidence interval, but that's not part of the definition of a CI or anything, it's just the CI you get if you think your statistic is normally distributed.
However, most calculations of confidence intervals, summarySE() included, use the t-distribution rather than the normal distribution to calculate critical values, because it produces more accurate results than the normal when sample sizes are small (and nearly identical results when they are large).
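To see how the two critical values compare as the sample size grows, here is a minimal sketch using base R's qt() and qnorm(). The sample sizes and variable names are made up for illustration only.

```r
# Illustrative sample sizes (not taken from the question's data)
n <- c(5, 10, 30, 100, 1000)

# Two-sided 95% critical values: normal vs. t with n - 1 degrees of freedom
z_crit <- qnorm(0.975)
t_crit <- qt(0.975, df = n - 1)

# Side-by-side comparison; ratio shows how much wider the t-based interval is
comparison <- data.frame(n = n, t_crit = t_crit, z_crit = z_crit,
                         ratio = t_crit / z_crit)
print(comparison)
```

The t critical value is always larger than the normal one, and the ratio shrinks toward 1 as n increases.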
Here, your sample size is only N=10, so the difference between the normal-distribution 1.96 and the critical value from the t distribution is noticeable. The 2.5th percentile of the t distribution with 10-1 = 9 degrees of freedom is qt(.025, 9) = -2.262157. So c = 2.262157 for a two-sided 95% confidence interval.
0.1261062*2.262157 = 0.285272, and this is where the confidence interval column comes from.
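If you want to check this yourself, the arithmetic can be reproduced in base R roughly as follows. This is only a sketch of the calculation described above, not a copy of summarySE()'s source; the variable names are my own, and the numbers are the constructionA row from your output.

```r
# Values from the constructionA row of the question's output
mean_a <- 6.025
se     <- 0.1261062   # sd / sqrt(N)
n      <- 10
conf_level <- 0.95

# Upper-tail probability for a two-sided interval (0.975 for 95%)
p <- 1 - (1 - conf_level) / 2

# Half-width using the t distribution with n - 1 degrees of freedom
ci_t <- se * qt(p, df = n - 1)

# Half-width using the normal distribution, for comparison
ci_z <- se * qnorm(p)

print(c(t_based = ci_t, normal_based = ci_z))

# Resulting t-based 95% interval around the mean
print(c(lower = mean_a - ci_t, upper = mean_a + ci_t))
```

The t-based half-width should match the ci column, and the normal-based one should match the value you computed by hand.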
Upvotes: 2