ccmullally
ccmullally

Reputation: 41

Interaction terms not showing up in glm

I am puzzling over an issue with glm where interaction terms don't show up in the estimation results unless they are entered as separate variables. I was not able to reproduce the problem using the mtcars data set, but there is nothing unusual about my data.

The first three models below give the same results, while the fourth gives the expected result: Code:

probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
summary(probit_model2)

probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original*pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
summary(probit_model2)

probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original**pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
summary(probit_model2)

df <- df %>% mutate(squared = pop1_3km_original*pop1_3km_original, .after=pop1_3km_original)

probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original+ squared, family = binomial(link = "probit"), data = df, weights = indiv_weight)
summary(probit_model2)

Output:

> probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
Warning message:
In eval(family$initialize) : non-integer #successes in a binomial glm!
> summary(probit_model2)

Call:
glm(formula = any_ics99 ~ wgrp1 + pop1_3km_original, family = binomial(link = "probit"), 
    data = df, weights = indiv_weight)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-5.784  -3.079  -2.257   2.729   8.409  

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)        0.33000    0.01479   22.31   <2e-16 ***
wgrp1             -0.77032    0.01717  -44.87   <2e-16 ***
pop1_3km_original -0.44447    0.01958  -22.70   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 31927  on 2327  degrees of freedom
Residual deviance: 29474  on 2325  degrees of freedom
  (1239 observations deleted due to missingness)
AIC: 29642

Number of Fisher Scoring iterations: 5

> 
> probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original*pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
Warning message:
In eval(family$initialize) : non-integer #successes in a binomial glm!
> summary(probit_model2)

Call:
glm(formula = any_ics99 ~ wgrp1 + pop1_3km_original * pop1_3km_original, 
    family = binomial(link = "probit"), data = df, weights = indiv_weight)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-5.784  -3.079  -2.257   2.729   8.409  

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)        0.33000    0.01479   22.31   <2e-16 ***
wgrp1             -0.77032    0.01717  -44.87   <2e-16 ***
pop1_3km_original -0.44447    0.01958  -22.70   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 31927  on 2327  degrees of freedom
Residual deviance: 29474  on 2325  degrees of freedom
  (1239 observations deleted due to missingness)
AIC: 29642

Number of Fisher Scoring iterations: 5

> 
> probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original**pop1_3km_original, family = binomial(link = "probit"), data = df, weights = indiv_weight)
Error in terms.formula(formula, data = data) : invalid power in formula
> summary(probit_model2)

Call:
glm(formula = any_ics99 ~ wgrp1 + pop1_3km_original * pop1_3km_original, 
    family = binomial(link = "probit"), data = df, weights = indiv_weight)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-5.784  -3.079  -2.257   2.729   8.409  

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)        0.33000    0.01479   22.31   <2e-16 ***
wgrp1             -0.77032    0.01717  -44.87   <2e-16 ***
pop1_3km_original -0.44447    0.01958  -22.70   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 31927  on 2327  degrees of freedom
Residual deviance: 29474  on 2325  degrees of freedom
  (1239 observations deleted due to missingness)
AIC: 29642

Number of Fisher Scoring iterations: 5

> 
> df <- df %>% mutate(squared = pop1_3km_original*pop1_3km_original, .after=pop1_3km_original)
> 
> probit_model2 <- glm(any_ics99 ~ wgrp1 + pop1_3km_original+ squared, family = binomial(link = "probit"), data = df, weights = indiv_weight)
Warning message:
In eval(family$initialize) : non-integer #successes in a binomial glm!
> summary(probit_model2)

Call:
glm(formula = any_ics99 ~ wgrp1 + pop1_3km_original + squared, 
    family = binomial(link = "probit"), data = df, weights = indiv_weight)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-5.839  -3.056  -2.288   2.755   8.401  

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)        0.31527    0.01689  18.669  < 2e-16 ***
wgrp1             -0.76865    0.01720 -44.702  < 2e-16 ***
pop1_3km_original -0.35047    0.05525  -6.344 2.24e-10 ***
squared           -0.07031    0.03878  -1.813   0.0698 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 31927  on 2327  degrees of freedom
Residual deviance: 29471  on 2324  degrees of freedom
  (1239 observations deleted due to missingness)
AIC: 29641

Number of Fisher Scoring iterations: 5

Upvotes: 1

Views: 691

Answers (1)

jay.sf
jay.sf

Reputation: 72828

This type of interaction is actually a polynomial, and * won't work here. You may want to add the squared term, but use the I() function to prevent glm from interpreting it as arithmetic operation.

glm(am ~ mpg*mpg, family=binomial(link='probit'), mtcars)$coe  ## won't work
# (Intercept)         mpg 
#  -3.9102065   0.1822684 

glm(am ~ mpg + I(mpg^2), family=binomial(link='probit'), mtcars)$coe  ## works
#  (Intercept)          mpg     I(mpg^2) 
# -1.504927683 -0.064500921  0.006066692 

Or use poly(..., raw=TRUE)

glm(am ~ poly(mpg, 2, raw=TRUE), family=binomial(link='probit'), mtcars)$coe
 # (Intercept) poly(mpg, 2, raw = TRUE)1 poly(mpg, 2, raw = TRUE)2 
# -1.504927683              -0.064500921               0.006066692 

Or calculate the term beforehand.

glm(am ~ mpg + mpgsq, family=binomial(link='probit'), transform(mtcars, mpgsq=mpg^2))$coe
#  (Intercept)          mpg        mpgsq 
# -1.504927683 -0.064500921  0.006066692 

Upvotes: 2

Related Questions