Deviance residual plot for logistic regression

Question

For my class project, we are supposed to fit a logistic regression to the Framingham data set.

fit_select <- glm(Event ~ Sex + age.group + I(log(Cigar.Day + 1)) + BP.Med + Prev.Hyp + Diab + I(log(Tol.Chol)) + BMI + Gluc + bp.level,
                  data = data, family = binomial(link = "logit"))

We tried to plot the deviance residuals. (I know that those are supposed to be binomial, but we have over 3000 observations, so by the CLT they should behave normally.)

qqnorm(residuals(fit_select, type = "deviance"))

We get unexpected output.

What is wrong? I am not sure how to interpret this.

Benjamin Christoffersen · Accepted Answer

"When we try to plot deviance residuals (and I know that those are supposed to be binomial, but we have over 3000 observations, so by the CLT they should behave normally)"

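They will not be normally distributed when you have binary responses, however many observations you have. The central limit theorem applies to sums and averages of observations, not to the individual residuals, and with a binary response each deviance residual can take only one of two values for a given fitted probability. You need count data to get a normal approximation. If I recall correctly, a rough rule of thumb is an expected count of 5 or greater for binomial and Poisson models.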
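To see why a large sample does not help, it may be useful to write out the deviance residual for a single binary observation. The notation here is my own and not from the question: $y_i \in \{0, 1\}$ is the response and $\hat{p}_i$ is the fitted probability of an event.

```latex
d_i = \operatorname{sign}(y_i - \hat{p}_i)
      \sqrt{-2\left[\, y_i \log \hat{p}_i + (1 - y_i) \log(1 - \hat{p}_i) \,\right]}
    = \begin{cases}
        \sqrt{-2 \log \hat{p}_i},        & y_i = 1, \\
        -\sqrt{-2 \log(1 - \hat{p}_i)},  & y_i = 0.
      \end{cases}
```

For example, with $\hat{p}_i = 0.2$ the residual is either about $1.79$ (event) or about $-0.67$ (no event). Adding more observations adds more such two-valued residuals, but it does not make any one of them closer to normal.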

"What is wrong? I am not sure how to interpret this."

Nothing is necessarily wrong with the model. A normal Q-Q plot of the deviance residuals cannot be used for anything when you have binary responses.
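A small simulation may make this concrete. The sketch below uses made-up data (not the Framingham data) and hypothetical names `x`, `y`, and `fit`. It fits a correctly specified logistic regression to 3000 binary responses and draws the same Q-Q plot as in the question.

```r
# Simulated data; the coefficients -1 and 0.5 are arbitrary choices
set.seed(1)
n <- 3000
x <- rnorm(n)
p <- plogis(-1 + 0.5 * x)           # true event probabilities
y <- rbinom(n, size = 1, prob = p)  # binary responses

# The fitted model matches the data-generating model
fit <- glm(y ~ x, family = binomial(link = "logit"))
d <- residuals(fit, type = "deviance")

# Every residual is positive when y = 1 and negative when y = 0
table(sign = sign(d), y = y)

# Same plot as in the question, with a reference line added
qqnorm(d)
qqline(d)
```

Because the residuals split into a positive group and a negative group, the plot should show two separate curves and not a straight line, even though the model is correct here. That is why the plot tells you nothing about model fit when the responses are binary.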
