Reputation: 13
I'm trying to plot error bands around a linear regression. I'm working with the builtin trees
dataset in R. Here's my code. No lines are showing up on the plot. Please help!
xdev <- log(Volume) - mean(log(Volume))
xdev
ydev <- Girth - mean(Girth)
ydev
b1 <- sum((xdev)*(ydev))/sum(xdev^2)
b0 <- mean(Girth) - mean(log(Volume))*b1
plot(log(Volume) ~ Girth)
abline(coef(lm(log(Volume) ~ Girth)), col=2)
(paste(b0, b1))
y.hat <- b0 + b1*log(Volume)
ss.explained <- sum((y.hat - mean(Girth))^2)
ss.unexplained <- sum((y.hat - Girth)^2)
stderr.b1 <- sqrt((ss.unexplained/29)/sum(xdev^2))
stderr.b1
std.yhat <- sqrt((ss.unexplained/29)*((1/31) + (xdev^2/sum(xdev^2))))
std.yhat
upp <- y.hat + qt(0.95, 29)*std.yhat
low <- y.hat - qt(0.95, 29)*std.yhat
upp
low
library(lattice)
plot(log(Volume) ~ Girth, data = trees)
abline(c(b0, b1), lty=1)
points(upp ~ Girth, data=trees, type="l", lty=2)
points(low ~ Girth, data=trees, type="l", lty=2)
Upvotes: 1
Views: 415
Reputation: 73252
Obviously you want to calculate OLS by hand as well as predictions with confidence bands. Since I'm not familiar with the approach you are using (but you definitely mixed up Y and X in the beginning of your code), I show you how to get the confidence bands with a similar manual approach I'm more familiar with.
data(trees) ## load trees dataset
## fit
X <- cbind(1, trees$Girth)
y <- log(trees$Volume)
beta <- solve(t(X) %*% X) %*% t(X) %*% y
y.hat <- as.vector(X %*% beta)
## std. err. y.hat
res <- y - y.hat
n <- nrow(trees); k <- ncol(X)
VCV <- 1/(n - k)*as.vector(t(res) %*% res)*solve(t(X) %*% X)
se.yhat <- sqrt(diag(X %*% VCV %*% t(X)))
## CIs
tcrit <- qt(0.975, n - k)
upp <- y.hat + tcrit*se.yhat
low <- y.hat - tcrit*se.yhat
## plot
with(trees, plot(x=Girth, y=log(Volume)))
abline(a=beta, col=2)
lines(x=trees$Girth, y=upp, lty=2)
lines(x=trees$Girth, y=low, lty=2)
We can compare this with the results of respective R functions, which will give us the same values and, thus, the same plot as above.
fit <- lm(log(Volume) ~ Girth, trees)
pred <- stats::predict.lm(fit, se.fit=TRUE, df=fit$df.residual)
with(trees, plot(x=Girth, y=log(Volume)))
abline(fit, col=2)
lines(x=trees$Girth, y=pred$fit + tcrit*pred$se.fit, lty=2)
lines(x=trees$Girth, y=pred$fit - tcrit*pred$se.fit, lty=2)
Since you have noted lattice
, you can further crosscheck using lattice::xyplot
:
lattice::xyplot(fit$fitted.values ~ log(trees$Volume),
panel=mosaic::panel.lmbands)
Note: It seems you are using attach
, which is discouraged in the community, see: Why is it not advisable to use attach() in R, and what should I use instead?. I therefore used with()
and `$`()
instead.
Upvotes: 0