yap

Reputation: 61

R: Bootstrap percentile confidence interval

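library(boot)
set.seed(1)
x <- sample(0:1000, 1000)
y <- function(u, i) sum(x[i])      # statistic: sum of the resampled values
o <- boot(x, y, 1000)              # 1000 bootstrap replicates
theta1 <- NULL
theta1 <- cbind(theta1, o$t)
b <- theta1[order(theta1)]         # sorted bootstrap replicates
bp1 <- c(b[25], b[975])            # method 1: 25th and 975th order statistics
ci <- boot.ci(o, type = "perc")    # method 2: boot.ci percentile interval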

I am using two methods to construct a bootstrap percentile confidence interval, but they give two different answers.

bp1=c(b[25], b[975]) gives (480474, 517834)

while ci=boot.ci(o,type="perc") gives (480476, 517837)

How does boot.ci construct the percentile interval?

Upvotes: 5

Views: 2582

Answers (2)

The basic bootstrap interval, which is the one I normally use, is:

est <- est.from.bootstrap
basic.bs <- c(2*est - quantile(bootstrap.vector, probs=0.975),
              2*est - quantile(bootstrap.vector, probs=0.025))

You can also just use the normal bootstrap interval given by:

est <- est.from.bootstrap
bs.interval <- c(est + sd(bootstrap.vector)*qnorm(0.025),
                 est + sd(bootstrap.vector)*qnorm(0.975))

Or you can use the plain percentile method:

est <- est.from.bootstrap
perc <- c(quantile(bootstrap.vector, probs=0.025),
          quantile(bootstrap.vector, probs=0.975))
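As a rough illustration, the three formulas above can be applied to the bootstrap object from the question. This is only a sketch: it assumes that est.from.bootstrap corresponds to o$t0 (the statistic on the original sample) and bootstrap.vector to o$t (the replicates), which the answer does not state. Note that quantile() uses its default (type 7) interpolation, so these numbers need not match boot.ci exactly.

```r
library(boot)

set.seed(1)
x <- sample(0:1000, 1000)
stat <- function(data, idx) sum(data[idx])   # statistic: sum of the resample
o <- boot(x, stat, R = 1000)

# Assumed mapping of the answer's placeholders onto the boot object
est <- o$t0                           # estimate from the original sample
bootstrap.vector <- as.vector(o$t)    # the 1000 bootstrap replicates

# Basic interval: reflect the bootstrap quantiles around the estimate
basic.bs <- unname(2 * est - quantile(bootstrap.vector, probs = c(0.975, 0.025)))

# Normal interval: estimate plus/minus normal quantiles times the bootstrap SD
norm.bs <- est + sd(bootstrap.vector) * qnorm(c(0.025, 0.975))

# Percentile interval: quantiles of the replicates themselves
perc <- unname(quantile(bootstrap.vector, probs = c(0.025, 0.975)))

print(rbind(basic = basic.bs, normal = norm.bs, percentile = perc))
```

The basic interval is just the percentile interval mirrored around est, so the two have the same width but different endpoints whenever the bootstrap distribution is not symmetric about est.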

Upvotes: 1

Bastien

Reputation: 3098

Typing boot.ci by itself at the console prints its source. There you can see that the percentile CI is calculated by the function perc.ci (around line 70). The package source is also available on GitHub. Looking up perc.ci there, you find this:

perc.ci <- function(t, conf = 0.95, hinv = function(t) t)
  #
  #  Bootstrap Percentile Confidence Interval Method
  #
{
  alpha <- (1+c(-conf,conf))/2
  qq <- norm.inter(t,alpha)
  cbind(conf,matrix(qq[,1L],ncol=2L),matrix(hinv(qq[,2]),ncol=2L))
}

This in turn calls norm.inter, which appears to be the function that computes the values used as percentiles. The comment above this function in the same GitHub source tells us:

Interpolation on the normal quantile scale. For a non-integer order statistic this function interpolates between the surrounding order statistics using the normal quantile scale. See equation 5.8 of Davison and Hinkley (1997)

So it seems boot.ci interpolates between neighbouring order statistics on the normal quantile scale, which explains why its result differs from your purely empirical b[25] and b[975].
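The interpolation can be sketched by hand to check this explanation. The helper norm_interp below is hypothetical (it is not part of boot) and reflects my reading of norm.inter: the target rank is (R + 1) * alpha, which for R = 1000 is 25.025 and 975.975 rather than 25 and 975, and the endpoint is interpolated between the two surrounding order statistics with weights taken from qnorm. It assumes the rank falls strictly inside 1..R and does not handle the extreme cases.

```r
library(boot)

set.seed(1)
x <- sample(0:1000, 1000)
o <- boot(x, function(data, idx) sum(data[idx]), R = 1000)

# Hypothetical helper: normal-quantile-scale interpolation between
# adjacent order statistics (assumes 1 <= floor((R + 1) * alpha) < R).
norm_interp <- function(t, alpha) {
  b  <- sort(as.vector(t))       # order statistics of the replicates
  R  <- length(b)
  rk <- (R + 1) * alpha          # target rank, possibly fractional
  k  <- floor(rk)
  if (k == rk) return(b[k])      # integer rank: no interpolation needed
  z   <- qnorm(alpha)
  zk  <- qnorm(k / (R + 1))
  zk1 <- qnorm((k + 1) / (R + 1))
  b[k] + (z - zk) / (zk1 - zk) * (b[k + 1] - b[k])
}

manual <- c(norm_interp(o$t, 0.025), norm_interp(o$t, 0.975))
ci <- boot.ci(o, type = "perc")

print(manual)
print(ci$percent[4:5])           # lower and upper endpoints from boot.ci
```

With 1000 replicates the lower endpoint therefore lies slightly above b[25], between b[25] and b[26], and the upper endpoint lies between b[975] and b[976], close to b[976].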

Upvotes: 4
