Reputation: 10980
I'm using the boot package in R to calculate bootstrapped SEs and confidence intervals. I'm trying to find an elegant and efficient way of getting the names of my parameters along with the bootstrap distribution of their estimates. For instance, consider the simple example given here:
# Bootstrap 95% CI for regression coefficients
library(boot)
# function to obtain regression weights
bs = function(data, indices, formula) {
d = data[indices,] # allows boot to select sample
fit = lm(formula, data=d)
return(coef(fit))
}
# bootstrapping with 1000 replications
results = boot(
data=mtcars,
statistic=bs,
R=1000,
formula=mpg~wt+disp)
This works fine, except that the coefficients in the output are labelled only by numeric indices (t1*, t2*, t3*):
# view results
results
Bootstrap Statistics :
original bias std. error
t1* 34.96055404 0.1559289371 2.487617954
t2* -3.35082533 -0.0948558121 1.152123237
t3* -0.01772474 0.0002927116 0.008353625
Particularly when getting into long, complicated regression formulas, involving a variety of factor variables, it can take some work to keep track of precisely which indices go with which coefficient estimates.
I could of course re-fit my model outside of the bootstrap function and extract the names with names(coef(fit)), or use something else such as a call to model.matrix(). These seem cumbersome, both in terms of extra coding and in terms of extra CPU and RAM.
How can I more easily get a vector of the coefficient names to pair with a vector of coefficient standard errors in situations like this?
UPDATE
Based on the great answer from lmo, here is my code to build a basic regression table:
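Names = names(results$t0)
SEs = apply(results$t, 2, sd) # bootstrap SE = SD of each column of replicates
Coefs = as.numeric(results$t0)
zVals = Coefs / SEs
Pvals = 2*pnorm(-abs(zVals))
# data.frame() keeps the numeric columns numeric; cbind() with Names would coerce everything to character
Formatted_Results = data.frame(Names, Coefs, SEs, zVals, Pvals)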
Upvotes: 4
Views: 1314
Reputation: 38500
The estimates from calling the bootstrapped function, here lm, on the original data are stored in the element t0 of the list returned by boot.
results$t0
(Intercept) wt disp
34.96055404 -3.35082533 -0.01772474
This object preserves the names of the estimates from the original function call, which you can then access with names.
names(results$t0)
[1] "(Intercept)" "wt" "disp"
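Building on this, here is a minimal sketch of how those names might be attached to the matrix of bootstrap replicates in results$t, so that each coefficient's bootstrap distribution can be addressed by name. It re-creates the results object from the question so that it runs on its own; boot_dist, boot_se, boot_q and wt_ci are hypothetical names chosen for illustration, and the quantile intervals are plain empirical quantiles rather than anything specific to boot.

```r
library(boot)

# Statistic from the question: refit the model on each resample
bs <- function(data, indices, formula) {
  d <- data[indices, ]
  coef(lm(formula, data = d))
}

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
results <- boot(data = mtcars, statistic = bs, R = 1000,
                formula = mpg ~ wt + disp)

# results$t is an R x p matrix with no column names; borrow them from t0
boot_dist <- results$t
colnames(boot_dist) <- names(results$t0)

# Named vector of bootstrap standard errors
boot_se <- apply(boot_dist, 2, sd)

# Simple empirical 2.5% / 97.5% quantiles, one row per coefficient
boot_q <- t(apply(boot_dist, 2, quantile, probs = c(0.025, 0.975)))

# Look up a coefficient's index by name instead of counting positions
wt_ci <- boot.ci(results, type = "perc",
                 index = which(names(results$t0) == "wt"))
```

After this, boot_dist[, "wt"] gives the bootstrap distribution of the wt coefficient, and boot_se["wt"] its standard error, without refitting the model outside of boot.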
Upvotes: 3