Reputation: 165
For my class project, we are supposed to fit a logistic regression on the Framingham data set.
fit_select <- glm(Event ~ Sex + age.group + I(log(Cigar.Day + 1)) + BP.Med + Prev.Hyp + Diab + I(log(Tol.Chol)) + BMI + Gluc + bp.level, data = data, family = binomial(link = "logit"))
When we try to plot the deviance residuals (I know that those are supposed to be binomial, but we have over 3000 observations, so by the CLT they should behave normally):
qqnorm(residuals(fit_select, type = "deviance"))
We get
What is wrong? I am not sure how to interpret this.
Upvotes: 0
Views: 747
Reputation: 4841
When we try to plot the deviance residuals (I know that those are supposed to be binomial, but we have over 3000 observations, so by the CLT they should behave normally):
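The deviance residuals should not be normally distributed when you have binary responses. The CLT applies to sums or averages of many observations, not to each individual residual: with a 0/1 outcome, every deviance residual can take only one of two values given its fitted probability, no matter how many observations you have. You need count data to get a normal approximation. If I recall correctly, a rough rule of thumb is an expected count of 5 or greater for binomial and Poisson models.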
What is wrong? I am not sure how to interpret this.
You cannot use the plot for anything when you have binary responses.
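To see why the Q-Q plot is uninformative for binary responses, here is a minimal sketch on simulated data. The variables x and y and the coefficients are made up for illustration and stand in for the Framingham data, which is not reproduced here. It checks that each deviance residual is fully determined by the 0/1 response and its fitted probability, so the residuals split into a positive branch (events) and a negative branch (non-events) even with 3000 observations.

```r
# Simulated stand-in for the real data (x, y and the coefficients are hypothetical).
set.seed(1)
n <- 3000
x <- rnorm(n)
p <- plogis(-1 + 0.8 * x)            # true event probabilities
y <- rbinom(n, size = 1, prob = p)   # binary response, as in the question

fit <- glm(y ~ x, family = binomial(link = "logit"))
r <- residuals(fit, type = "deviance")
p_hat <- fitted(fit)

# For a binary response, each deviance residual depends only on y and p_hat:
#   y = 1: +sqrt(-2 * log(p_hat))
#   y = 0: -sqrt(-2 * log(1 - p_hat))
r_manual <- ifelse(y == 1,
                   sqrt(-2 * log(p_hat)),
                   -sqrt(-2 * log(1 - p_hat)))

# Compare the manual formula with what residuals() returned.
all_match <- isTRUE(all.equal(unname(r), unname(r_manual)))

# Residuals are positive exactly for events and negative for non-events.
signs_follow_y <- all(r[y == 1] > 0) && all(r[y == 0] < 0)

# The Q-Q plot therefore shows two separate branches rather than a line.
qqnorm(r)
```

If the responses were instead grouped into counts with reasonably large expected values, the same plot would be closer to a straight line, which is the situation the rule of thumb above refers to.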
Upvotes: 1