How does the R Survey package compute bootstrap confidence intervals?

Question

I'm wondering how to use replication weights and the confint implementation of the survey package to construct bootstrap confidence intervals/standard errors.

Looking at the survey package's implementation of confint, it seems as though it's simply taking the standard error of the theta list generated after replication, and multiplying it by the statistic corresponding to a given alpha range.

But that doesn't really correspond with any bootstrap implementation I'm aware of. Typically, you'd instead be using a percentile of the distribution of theta to get the sample mean's distribution's confidence interval. The T and BCa intervals are another matter.

Here is my R Code. I have not used the provided weights, instead letting repweights weights be generated as "sample with replacement" weights with equal probability.

data(api)
d <- apiclus1 %>% select(fpc, dnum,api99)
dclus1<-svydesign(id=~dnum, data=d, fpc=~fpc)
rclus1<-as.svrepdesign(dclus1,type="bootstrap", replicates=100)

To test the confidence intervals, we can use:

test_mean <- svymean(~api99, rclus1)
confint(test_mean, df=degf(rclus1))
confint(test_mean, df=degf(rclus1)) - mean(d$api99)

Which results:

         2.5 %   97.5 %
api99 554.2971 659.6592
          2.5 %   97.5 %
api99 -52.68107 52.68107

So clearly the interval is symmetric, which defeats some of the purpose of using the bootstrap.

So let's try this:

test_bs <- withReplicates(rclus1, function(w, data) weighted.mean(data$api99, w), return.replicates=T)

This will bootstrap the replicates, where the weights are the repweights (which I assume are with replacement weights). Here are the intervals using BCa intervals on the replications:

bca(test$replicates) - mean(d$api99)
-43.2878  49.1148

Clearly not symmetric.

Using percentile intervals:

c(quantile(test$replicates, 0.025),quantile(test$replicates, 0.975))  - mean(d$api99)
     2.5%     97.5% 
-45.50944  48.06085

Valliant implements percentile intervals this way, which should be equivalent to my percentile intervals:

 smho.boot.a <- as.svrepdesign(design = smho.dsgn,
                              type = "subbootstrap",
                              replicates = 500)
         # total & CI for EOYCNT based on RW bootstrap
 a1 <- svytotal( ̃EOYCNT, design = smho.boot.a,
                 na.rm=TRUE,
                 return.replicates = TRUE)
              # Compute CI based on bootstrap percentile method.
 ta1 <- quantile(a1$replicates, c(0.025, 0.975))

I'm looking for clarifications on

a. How to construct bootstrap CIs with survey package for statistics of interest b. If my withReplication implementation of the bootstrap is correct

How does the R Survey package compute bootstrap confidence intervals?

Answers (1)

Related Questions