Reputation: 60
I'm trying to calculate confidence intervals from a t-test in R manually, and I suspect the way I calculate them is off.
Here's how I calculate the confidence intervals manually right now:
library(broom)
data("mtcars")
a1=tidy(t.test(mpg ~ am, mtcars))
mean_diff<-a1$estimate
tvalue <-a1$statistic
#standard error
sd1=sd(mtcars$mpg[mtcars$am==0])
sd2=sd(mtcars$mpg[mtcars$am==1])
n1=length(mtcars$mpg[mtcars$am==0])
n2=length(mtcars$mpg[mtcars$am==1])
#formula for standard error
stan_error=sqrt((sd1/n1)+(sd2/n2))
Then I take the formula for calculating the confidence intervals from this page: http://onlinestatbook.com/2/estimation/difference_means.html
The lower bound of the confidence interval I calculate like this:
lower = mean_diff - (tvalue * stan_error)
and the result comes out to be -4.147333
But the confidence intervals of
t.test(mpg ~ am, mtcars)
are
95 percent confidence interval:
-11.280194 -3.209684
Any ideas?
Upvotes: 3
Views: 3625
Reputation: 47
A more generalized approach can be taken for limited-size samples, or for cases where an a priori assumption of normality can't be made and thus a sampling-distribution model (e.g. Student's t) may not be available or suitable [1,2].
Start by loading the boot R package (version 1.3-24; session from Tue May 12 2020, R 4.0.0 "Arbor Day", x86_64-w64-mingw32).
library(boot)

# Statistic passed to boot(): returns the mean difference in mpg between
# transmission groups and an estimate of its variance
diff.means <- function(d, idx) {
  # mpg is column 1, am is column 9 of mtcars
  x1 <- d[idx, 1][d[idx, 9] == 1]   # resampled mpg, manual transmission
  x0 <- d[idx, 1][d[idx, 9] == 0]   # resampled mpg, automatic transmission
  m <- mean(x0, na.rm = TRUE) - mean(x1, na.rm = TRUE)
  # pooled variance of the mean difference; returning it as a second element
  # lets boot.ci compute studentized intervals
  sp2 <- (sum((x1 - mean(x1, na.rm = TRUE))^2) +
          sum((x0 - mean(x0, na.rm = TRUE))^2)) / (length(idx) - 2)
  v <- sp2 * (1 / length(x0) + 1 / length(x1))
  c(m, v)
}

mpg.boot <- boot(data = mtcars, statistic = diff.means, R = 100)
mpg.ci <- boot.ci(mpg.boot, type = "all")
mpg.ci$t0
mpg.ci$normal[2]
mpg.ci$basic[4]
mpg.ci$percent[4]
mpg.ci$bca[4]
mpg.ci$student[4]
With just 100 resampling rounds (R = 100) we can see that the bootstrap methods yield lower confidence bounds that are similar to one another and comparable to the Student's-t result:
The value of R is somewhat arbitrary and is limited mainly by execution speed. Higher R values do not create new data; they simply show how the statistic would behave if a larger sample were available.
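To illustrate the point about R, here is a small sketch (assuming a fixed seed for reproducibility, and using a simplified statistic that returns only the mean difference) showing how the percentile interval stabilizes as R grows:

```r
library(boot)

# Mean difference in mpg between transmission groups (automatic minus manual)
diff.means <- function(d, idx) {
  mean(d[idx, "mpg"][d[idx, "am"] == 0]) -
    mean(d[idx, "mpg"][d[idx, "am"] == 1])
}

set.seed(42)  # for reproducibility
for (R in c(100, 1000, 5000)) {
  b  <- boot(data = mtcars, statistic = diff.means, R = R)
  ci <- boot.ci(b, type = "perc")
  cat(sprintf("R = %4d: percentile CI = [%.3f, %.3f]\n",
              R, ci$percent[4], ci$percent[5]))
}
```

The intervals fluctuate noticeably at R = 100 and settle down at larger R, without ever adding information beyond what the 32 original observations contain.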
plot(mpg.boot)
Histogram and Q-Q plot of the resampled distribution
1: Bruce, P., Bruce, A., Practical Statistics for Data Scientists, O'Reilly, First Edition, 2017, pp. 57-60
Upvotes: 1
Reputation: 2318
Firstly, the critical value for t is not right.
tvalue <- a1$statistic
(the observed t statistic) needs to be replaced with
tvalue <- abs(qt(0.05/2, 30))
Note it's not 32 degrees of freedom, because we lose two: df = n1 + n2 - 2 = 30.
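As a quick sanity check, the two-sided 95% critical value at 30 degrees of freedom can be computed directly (a small sketch; the group sizes come from mtcars, where 19 cars have am == 0 and 13 have am == 1):

```r
n1 <- sum(mtcars$am == 0)  # 19
n2 <- sum(mtcars$am == 1)  # 13
df <- n1 + n2 - 2          # 30

# abs() because qt() on the lower tail returns a negative quantile
t_cv <- abs(qt(0.05 / 2, df))
t_cv  # ~2.042, noticeably larger than the normal approximation's 1.96
```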
And you are missing ^2
(i.e. to the power of two) in the formula for the standard error.
What you have in sd1
and sd2
are standard deviations, so you need to square them to get variances.
So the correct formula is:
stan_error = sqrt((sd1^2 / n1) + (sd2^2 / n2))
So the new code becomes:
library(broom)
data("mtcars")
a1=tidy(t.test(mpg ~ am, mtcars))
mean_diff<-a1$estimate
t_cv<- abs(qt(0.05/2, 30))
#standard error
sd1=sd(mtcars$mpg[mtcars$am==0])
sd2=sd(mtcars$mpg[mtcars$am==1])
n1=length(mtcars$mpg[mtcars$am==0])
n2=length(mtcars$mpg[mtcars$am==1])
#formula for standard error
stan_error = sqrt((sd1^2 / n1) + (sd2^2 / n2))
lower=mean_diff - (t_cv* stan_error)
lower
[1] -11.17264
But this still doesn't match the confidence interval from the t.test
function, because t.test
performs Welch's t-test by default (https://en.wikipedia.org/wiki/Welch%27s_t-test).
So your t critical value under Welch's t-test should use the Welch–Satterthwaite degrees of freedom:
# Welch's t test degrees of freedom
welch_df <- (sd1^2/n1 + sd2^2/n2)^2 / (sd1^4/(n1^2*(n1-1)) + sd2^4/(n2^2*(n2-1)))
t_cv <- abs(qt(0.05/2, welch_df))
# Recalculate lower confidence interval
lower= mean_diff - (t_cv* stan_error)
lower
[1] -11.28019 # this matches the lower bound of the t.test confidence interval
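Putting the pieces together, a self-contained check (a sketch recomputing everything from mtcars, without broom) confirms that both bounds reproduce t.test's interval:

```r
# Group standard deviations and sizes
sd1 <- sd(mtcars$mpg[mtcars$am == 0]); n1 <- sum(mtcars$am == 0)
sd2 <- sd(mtcars$mpg[mtcars$am == 1]); n2 <- sum(mtcars$am == 1)

# Mean difference (group 0 minus group 1, matching t.test's estimate)
mean_diff  <- mean(mtcars$mpg[mtcars$am == 0]) - mean(mtcars$mpg[mtcars$am == 1])
stan_error <- sqrt(sd1^2 / n1 + sd2^2 / n2)

# Welch-Satterthwaite degrees of freedom and critical value
welch_df <- (sd1^2/n1 + sd2^2/n2)^2 /
            (sd1^4 / (n1^2 * (n1 - 1)) + sd2^4 / (n2^2 * (n2 - 1)))
t_cv <- abs(qt(0.05 / 2, welch_df))

manual_ci <- c(mean_diff - t_cv * stan_error, mean_diff + t_cv * stan_error)
manual_ci
t.test(mpg ~ am, mtcars)$conf.int  # should agree: -11.280194 -3.209684
```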
Upvotes: 1