user1687130

Reputation: 1881

R: likelihood ratio test between two models fails because missing data leaves them fitted to different numbers of observations

I'm trying to do a likelihood ratio test between two models.

glm.model1 <- glm(result ~ height + weight)
glm.model2 <- glm(result ~ height + weight + speed + speed:height + speed:weight)
require(lmtest)
a <- lrtest(glm.model1, glm.model2)

And I got the following error:

Error in lrtest.default(glm.model1, glm.model2) : 
models were not all fitted to the same size of dataset

Some of my "speed" values are missing, but none of the height or weight values are. Because model 2 includes "speed" and model 1 does not, glm drops the rows with missing speed from model 2 only. The two models are then fitted to different numbers of observations, and the likelihood ratio test fails with the error above. Is there a way to find out which data points were dropped from model 2, so that I can remove the same rows when fitting the reduced model?

Here's what I've tried:

1) Adding na.action = na.pass to model 2 to keep the rows with missing data, but it doesn't work.

2) tried:

glm.model1 <- glm(result ~ height + weight + speed - speed)
## This works: it drops the samples with "speed" missing, but it feels like cheating.

Here's the summary of each model:

summary(glm.model1)

......

    Null deviance: 453061  on 1893  degrees of freedom
Residual deviance: 439062  on 1891  degrees of freedom
AIC: 15698

Number of Fisher Scoring iterations: 2


summary(glm.model2)

......
    Null deviance: 451363  on 1887  degrees of freedom
Residual deviance: 437137  on 1882  degrees of freedom
  (6 observations deleted due to missingness)          ## This is what I want to look at
AIC: 15652

Number of Fisher Scoring iterations: 2

How can I see which observations were deleted, and drop the same observations from the other model in my script? Thanks!

Upvotes: 3

Views: 11144

Answers (1)

Patrick Coulombe

Reputation: 320

You can use the subset argument of the glm() function:

glm.model1 <- glm(result ~ height + weight, subset = !is.na(speed))
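
To also see which rows were dropped, one option is na.action(), which returns the indices of the omitted rows for a fitted glm. The sketch below is illustrative only: the data frame dat and its values are simulated stand-ins (an assumption, since the original data are not shown), and it uses anova() from base R for the comparison rather than lrtest().

```r
# Simulated stand-in for the asker's data (hypothetical values).
set.seed(1)
n <- 200
dat <- data.frame(height = rnorm(n, 170, 10),
                  weight = rnorm(n, 70, 12),
                  speed  = rnorm(n, 20, 4))
dat$result <- 0.5 * dat$height + 0.3 * dat$weight + rnorm(n, sd = 5)
dat$speed[c(3, 17, 42, 88, 120, 199)] <- NA  # six missing speed values

# Full model: glm() silently drops the rows where speed is NA.
glm.model2 <- glm(result ~ height + weight + speed + speed:height + speed:weight,
                  data = dat)

# Indices of the rows that were dropped, named by their row names.
dropped <- na.action(glm.model2)
print(names(dropped))

# Reduced model refitted on the same rows as the full model.
glm.model1 <- glm(result ~ height + weight, data = dat,
                  subset = !is.na(speed))

# Both models should now be fitted to the same number of observations.
stopifnot(nobs(glm.model1) == nobs(glm.model2))

# Compare the nested models.
print(anova(glm.model1, glm.model2, test = "Chisq"))
```

Once both models are fitted to the same rows, lrtest(glm.model1, glm.model2) should no longer complain about dataset size. With several variables that may be missing, building the subset from complete.cases() on the relevant columns is a similar approach.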

Upvotes: 5
