Reputation: 11995
What is the most practical way of extracting the global p-value of a linear model, lm? I usually end up taking the results from summary and plugging the F-test statistic and degrees of freedom into pf:
set.seed(1)
n <- 10
x <- 1:10
y <- 2*x+rnorm(n)
fit <- lm(y ~ x)
summary(fit) # global p-value: 1.324e-08
fstat <- summary(fit)$fstatistic # full name; $fstat only works through partial matching
pval <- pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE)
pval
Upvotes: 3
Views: 216
Reputation: 132706
Since you asked for it:
Here is a bare-bones implementation that omits the bells and whistles (and checks) of lm. As a consequence it is faster, but you'd use it at your own risk, i.e., the warnings in help("lm.fit") apply. Due to laziness, code for calculation of the F-stats was extracted from the summary.lm source code and only slightly amended (so please consider licence() and citation("stats")).
fit1 <- lm.fit(cbind(1, x), y)
fstats <- function(obj) {
  p <- obj$rank
  rdf <- obj$df.residual
  r <- obj$residuals
  f <- obj$fitted.values
  mss <- sum((f - mean(f))^2)
  rss <- sum(r^2)
  resvar <- rss/rdf
  df.int <- 1L # assumes there is always an intercept
  fstatistic <- c(value = (mss/(p - df.int))/resvar,
                  numdf = p - df.int, dendf = rdf)
  fstatistic["pval"] <- pf(fstatistic[1L],
                           fstatistic[2L],
                           fstatistic[3L], lower.tail = FALSE)
  fstatistic
}
fstats(fit1)
# value numdf dendf pval
#5.321048e+02 1.000000e+00 8.000000e+00 1.324022e-08
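As a sanity check, the lm.fit-based calculation can be compared against what summary.lm reports for the same data. The following is only a sketch: it recomputes the same quantities as fstats() inline so that it runs on its own, and the names fit_full, fit_bare, p_bare and p_full are made up for illustration. Like fstats(), it assumes the design matrix contains an intercept column.

```r
# Sketch: compare the lm.fit-based F-test p-value with the one from summary.lm.
set.seed(1)
n <- 10
x <- 1:10
y <- 2 * x + rnorm(n)

fit_full <- lm(y ~ x)               # full-featured fit
fit_bare <- lm.fit(cbind(1, x), y)  # bare-bones fit, intercept column added by hand

# Same quantities as in fstats(), computed inline
rss   <- sum(fit_bare$residuals^2)
mss   <- sum((fit_bare$fitted.values - mean(fit_bare$fitted.values))^2)
numdf <- fit_bare$rank - 1L         # assumption: exactly one intercept column
dendf <- fit_bare$df.residual
fval  <- (mss / numdf) / (rss / dendf)
p_bare <- pf(fval, numdf, dendf, lower.tail = FALSE)

# Reference value taken from summary.lm
fs <- summary(fit_full)$fstatistic
p_full <- pf(fs[["value"]], fs[["numdf"]], fs[["dendf"]], lower.tail = FALSE)

# The two p-values should agree up to floating-point tolerance
all.equal(p_bare, p_full)
```

If the two values disagree for your own data, the most likely cause is a model without an intercept, which this shortcut does not handle.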
Upvotes: 2
Reputation: 20463
Check out the broom package:
library(broom)
set.seed(1)
n <- 10
x <- 1:10
y <- 2*x+rnorm(n)
fit <- lm(y ~ x)
glance(fit)
# r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual
# 1 0.9851881 0.9833366 0.8090653 532.1048 1.324022e-08 2 -10.95491 27.90982 28.81758 5.236693 8
glance(fit)$p.value
# [1] 1.324022e-08
tidy(fit)
# term estimate std.error statistic p.value
# 1 (Intercept) -0.1688236 0.55269681 -0.3054542 7.678170e-01
# 2 x 2.0547321 0.08907516 23.0673979 1.324022e-08
Upvotes: 5
Reputation: 368
Check the source of print.summary.lm: it uses the pf function to get the p-value.
format.pval(pf(x$fstatistic[1L],
x$fstatistic[2L], x$fstatistic[3L], lower.tail = FALSE),
digits = digits)
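If you need this often, the same pf call can be wrapped in a small helper. This is a minimal sketch, not something the stats package provides: overall_p is a hypothetical name, and returning NA when summary() reports no F-statistic (as for an intercept-only model) is a design choice made here.

```r
# Hypothetical helper (not part of stats): overall F-test p-value of an lm fit
overall_p <- function(model) {
  stopifnot(inherits(model, "lm"))
  fs <- summary(model)$fstatistic
  # summary.lm stores no F-statistic for an intercept-only model
  if (is.null(fs)) return(NA_real_)
  pf(fs[["value"]], fs[["numdf"]], fs[["dendf"]], lower.tail = FALSE)
}

set.seed(1)
x <- 1:10
y <- 2 * x + rnorm(10)

p_slope <- overall_p(lm(y ~ x))  # same model as in the question
p_null  <- overall_p(lm(y ~ 1))  # intercept only: no F-test available
```

Unlike the formatted string printed by summary, this returns the unrounded numeric p-value, so it can be used directly in further computations.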
Upvotes: 0