matrivi
matrivi

Reputation: 87

Shaded confidence interval bands for glm coefficients with covariates set to mean values

I would like to plot the line and the shaded 95% confidence interval bands (for example using polygon)from a glm model (family binomial)or using gglot. For linear models (lm), I have previously been able to plot the confidence intervals from the predictions as they included the fit, lower and upper level but I do not know how to do it here. I have tried to use the function predict.glm with the optional argument se.fit set to TRUE, and then using the prediction +/- 1.96 * std.error to calculate the confidence intervals but it did not work for me. Thanks for help in advance. You can find here the data that I used (it contains 10 variables and 996 observations): https://drive.google.com/file/d/1Yu7Dk2eh0R1ztKiuNTtN_W5Yg4C2Ne-2/view?usp=sharing Code and figure here:

# Models
mod= glm(site ~S + age + pH + soil + peat+
              spruce+ I(spruce^2)+pine+ birch+ 
              tsumma+ I(tsumma^2), 
              data=test.dat,family=binomial)


# Means of all covariates
means = apply(test.dat[,c("S", "pH","soil", "spruce", "pine","birch", "tsumma")],2,mean,na.rm=T)

# Calculate the constant given by all other covariates being at their means and assuming only pine on the plot
const = mod$coefficients[1]+
  mod$coefficients["S"]*means["S"]+
  mod$coefficients["pH"]*means["pH"]+
  mod$coefficients["soil"]*means["soil"]+
  mod$coefficients["spruce"]*means["spruce"]+
  mod$coefficients["I(spruce^2)"]*means["spruce"]*means["spruce"]+
  mod$coefficients["pine"]*means["pine"]+
  mod$coefficients["birch"]*means["birch"]+
  mod$coefficients["tsumma"]*means["tsumma"]+
  mod$coefficients["I(tsumma^2)"]*means["tsumma"]*means["tsumma"]

# Plot
age = seq(from=min(test.dat$age,na.rm=T),to=150,length=100)
lin= const + mod$coefficients["age"]*age
Pr = exp(lin) / (exp(lin)+1)
par(mar = c(4, 4, 1.5, 0.3))
plot(age,Pr,type="l", ylim=c(0,.5),las=1, main="Probability of hotspot", ylab="Probability of occurrence",xlab="Forest age (years)")

enter image description here

Upvotes: 1

Views: 246

Answers (1)

StupidWolf
StupidWolf

Reputation: 46968

You can use a package, indicating the term to plot while holding others constant:

library(sjPlot)
set.seed(888)
data = mtcars
data$vs = data$vs + rnorm(nrow(data))
mod = glm(am ~ disp + vs + carb+ I(vs^2),data=data,family="binomial")
plot_model(mod,type="pred",terms="disp")

enter image description here

Or derive it like you did, except I think you might need to create the extra term for the squared value, so that you can hold the other terms at their means, and use the predict.lm function :

data$vs2 = data$vs^2
mod = glm(am ~ disp + vs + carb+ vs2,data=data,family="binomial")
varMeans = colMeans(mod$model)[c("vs","carb","vs2")]
pred_disp = seq(min(data$disp),max(data$disp),length.out=100)
df = data.frame(
                disp = pred_disp,
                t(replicate(length(pred_disp),varMeans))
               )
pred = predict(mod,df,se=TRUE)

plot(df$disp,plogis(pred$fit),"l")
lines(df$disp,plogis(pred$fit + 1.96*pred$se.fit),col="blue",lty=8)
lines(df$disp,plogis(pred$fit - 1.96*pred$se.fit),col="blue",lty=8)

enter image description here

Upvotes: 1

Related Questions