fast method to calculate p-values from .lm.fit()

Question

I am running simulations and fitting linear models with .lm.fit(). Although extremely fast, this function does not provide the predictors' p-values. Is there a fast way to compute them (maybe from the values returned by .lm.fit())? I am aware of this method to calculate approximate p-values, but I need the exact ones.

Update:
Dirk Eddelbuettel provided the fastest way to fit the lm and Ben Bolker a method to compute the p-values, by combining both answers we get:

set.seed(101)
X <- cbind(1,matrix(1:10))
y <- rnorm(10)

mdl <- RcppArmadillo::fastLmPure(X, y)

pval <- 2*pt(abs(mdl$coefficients/mdl$stderr), mdl$df.residual, lower.tail=FALSE)

Ben Bolker · Accepted Answer

Dirk's answer will be faster, but in case it's convenient here's the implementation in pure R (extracting the bits you need from summary.lm, and assuming there are no issues with non-full-rank model matrices etc.)

Example:

set.seed(101)
X <- cbind(1,matrix(1:10))
y <- rnorm(10)
m <- .lm.fit(X,y)

p-value calcs:

rss <- sum(m$residuals^2)
rdf <- length(y) - ncol(X)
resvar <- rss/rdf
R <- chol2inv(m$qr)
se <- sqrt(diag(R) * resvar)
2*pt(abs(m$coef/se),rdf,lower.tail=FALSE)

Compare with:

coef(summary(lm(y~X-1)))[,"Pr(>|t|)"]

fast method to calculate p-values from .lm.fit()

Answers (2)

Related Questions