Reputation: 5646
I was unsure whether this question would be more appropriate here or on Cross Validated. I hope I made the right choice.
Consider the example:
library(dplyr)
setosa <- iris %>% filter(Species == "setosa") %>% select(Sepal.Length, Sepal.Width, Species)
library(ggplot2)
ggplot(data = setosa, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
geom_smooth(method ="lm", formula = y ~ poly(x,2))
By default, ggplot
"displays confidence interval around smooth" (see here), given by the gray area around the regression curve. I've always assumed these are simultaneous confidence bands for the regression curve, not pointwise confidence bands. ggplot2
documentation refers to the predict
function for details on how the standard errors are computed. However, reading the doc for predict.lm, it doesn't say explicitly that simultaneous confidence bands are computed. So, what is the correct interpretation here?
Upvotes: 1
Views: 2529
Reputation: 13108
One way to check what predict.lm()
computes is to inspect the code (predict
multiplies standard errors by qt((1 - level)/2, df)
, and so does not appear to make adjustments for simultaneous inference). Another way is to construct simultaneous confidence intervals and compare them against predict
's intervals.
Fit the model and construct simultaneous confidence intervals:
setosa <- subset(iris, Species == "setosa")
setosa <- setosa[order(setosa$Sepal.Length), ]
fit <- lm(Sepal.Width ~ poly(Sepal.Length, 2), setosa)
K <- cbind(1, poly(setosa$Sepal.Length, 2))
cht <- multcomp::glht(fit, linfct = K)
cci <- confint(cht)
Reshape and plot:
cc <- as.data.frame(cci$confint)
cc$Sepal.Length <- setosa$Sepal.Length
cc <- reshape2::melt(cc[, 2:4], id.var = "Sepal.Length")
library(ggplot2)
ggplot(data = setosa, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
geom_smooth(method ="lm", formula = y ~ poly(x,2)) +
geom_line(data = cc,
aes(x = Sepal.Length, y = value, group = variable),
colour = "red")
It appears that predict(.., interval = "confidence")
does not produce simultaneous confidence intervals:
Upvotes: 4