Vincent

Reputation: 1371

R update() interaction term not dropped

Problem: I want to fit a linear model with interaction terms. After estimating a "full" model, I want to remove the non-significant interaction terms. However, when I call update(lm(), .~. -interaction) on my model, nothing changes. Please help.

Data:

library(car)
data(Prestige)
Prestige_compl <- Prestige[complete.cases(Prestige), ]  # drop rows with NAs
# attach() is not needed: the model below passes data = Prestige_compl

Model:

modR0 <- lm(prestige   ~
             income             +
             education          +
             women              +
             income    * type   +
             education * type   +
             women     * type   ,
             data      = Prestige_compl) 

# fit a linear model with interaction terms

summary.lm(modR0)

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)        -5.822e+00  7.311e+00  -0.796  0.42803    
income              4.692e-03  6.691e-04   7.013 5.00e-10 ***
education           1.625e+00  9.163e-01   1.773  0.07971 .  
women               1.343e-01  4.656e-02   2.885  0.00494 ** 
typeprof            2.436e+01  1.351e+01   1.803  0.07496 .  
typewc             -2.178e+01  1.727e+01  -1.261  0.21081    
income:typeprof    -4.144e-03  7.132e-04  -5.810 1.03e-07 ***
income:typewc      -7.527e-04  1.814e-03  -0.415  0.67924    
education:typeprof  1.512e+00  1.235e+00   1.224  0.22423    
education:typewc    2.123e+00  2.190e+00   0.970  0.33491    
women:typeprof     -1.601e-01  6.506e-02  -2.460  0.01588 *  
women:typewc        2.893e-02  1.117e-01   0.259  0.79619    

Remove non-significant interaction terms:

modR1 <- update(modR0, .~. -women:typewc)
summary.lm(modR1)

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)        -5.822e+00  7.311e+00  -0.796  0.42803    
income              4.692e-03  6.691e-04   7.013 5.00e-10 ***
education           1.625e+00  9.163e-01   1.773  0.07971 .  
women               1.343e-01  4.656e-02   2.885  0.00494 ** 
typeprof            2.436e+01  1.351e+01   1.803  0.07496 .  
typewc             -2.178e+01  1.727e+01  -1.261  0.21081    
income:typeprof    -4.144e-03  7.132e-04  -5.810 1.03e-07 ***
income:typewc      -7.527e-04  1.814e-03  -0.415  0.67924    
education:typeprof  1.512e+00  1.235e+00   1.224  0.22423    
education:typewc    2.123e+00  2.190e+00   0.970  0.33491    
women:typeprof     -1.601e-01  6.506e-02  -2.460  0.01588 *  
women:typewc        2.893e-02  1.117e-01   0.259  0.79619  

Why is the interaction term that should be removed still there?

Upvotes: 4

Views: 3379

Answers (1)

amit

Reputation: 3462

women:typewc is not a term in the model formula; it is the name of a single coefficient. The actual interaction term is women:type, which expands into two coefficients because type is a factor with three categories. Since women:typewc does not appear in the formula, update() has nothing to remove and simply refits the same model. Remember that a dummy-variable coefficient (also inside an interaction) always measures the difference between its category and the reference category. Removing only one category changes what the reference category means, so there is no consistent way to interpret an interaction (or a dummy variable) if you remove only some of its categories. You should either remove all categories or keep all of them.
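To make the distinction between formula terms and coefficients concrete, here is a minimal sketch on made-up data. The data frame d and its columns x and g are hypothetical stand-ins for women and type, not the Prestige data.

```r
# Hypothetical toy data: x plays the role of women, g the role of type.
d <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  g = factor(c("bc", "bc", "prof", "prof", "wc", "wc"))
)

# Formula terms: the units that update() can add or remove.
term_labels <- attr(terms(~ x * g), "term.labels")
print(term_labels)   # "x" "g" "x:g"

# Design-matrix columns: what summary() lists as coefficients.
mm_cols <- colnames(model.matrix(~ x * g, data = d))
print(mm_cols)       # "(Intercept)" "x" "gprof" "gwc" "x:gprof" "x:gwc"
```

The single term x:g produces the two columns x:gprof and x:gwc, which is why a name such as women:typewc shows up in the summary but not in the formula.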

If you use

 modR1 <- update(modR0, .~. -women:type)

the interaction term will be removed together with all of its coefficients. Note, however, that one of those coefficients, women:typeprof, was in fact statistically significant in the full model (p = 0.016).
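Because the interaction spans several coefficients, one common way to decide whether to drop it is to test the whole term at once rather than reading individual p-values. The sketch below is only an illustration under stated assumptions: the simulated data frame d, its columns, and the model names full, noop, and reduced are all hypothetical and do not come from the question.

```r
# Hypothetical simulated data, not the Prestige data.
set.seed(1)
d <- data.frame(
  x = rnorm(90),
  g = factor(rep(c("bc", "prof", "wc"), each = 30))
)
d$y <- 1 + 2 * d$x + rnorm(90)

full <- lm(y ~ x * g, data = d)

# x:gwc is a coefficient name, not a formula term, so nothing is removed.
noop <- update(full, . ~ . - x:gwc)
same_terms <- identical(attr(terms(full), "term.labels"),
                        attr(terms(noop), "term.labels"))
print(same_terms)

# Removing the term x:g drops both interaction coefficients together.
reduced <- update(full, . ~ . - x:g)
print(names(coef(reduced)))

# F-test comparing the nested models: tests both coefficients jointly.
print(anova(reduced, full))
```

With the models from the question, the analogous comparison would be anova(modR1, modR0), where modR1 is the fit without women:type.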

Upvotes: 4
