SDMcLean13

Reputation: 55

Limiting a linear model to 3rd level interactions in R

I have a dataset with 14 binary variables. I've already tested each variable on its own for significance, but I'd also like to check for significant interactions. However, I know that higher-order interactions are unlikely to be significant and would just muddle the model. Is there any way to fit a linear model in R, but tell it to only test interactions among a maximum of 3 variables?

Upvotes: 3

Views: 482

Answers (2)

CPak

Reputation: 13591

A manual approach

Use combn to generate every combination of three features

Comb <- combn(names(iris)[1:4],3)

Output

     [,1]           [,2]           [,3]           [,4]          
[1,] "Sepal.Length" "Sepal.Length" "Sepal.Length" "Sepal.Width" 
[2,] "Sepal.Width"  "Sepal.Width"  "Petal.Length" "Petal.Length"
[3,] "Petal.Length" "Petal.Width"  "Petal.Width"  "Petal.Width"

Then use as.formula to build a formula from each combination of 3 features and fit one model per combination

ans <- apply(Comb, 2, function(x) glm(as.formula(paste0("Species ~ ", paste0(x, collapse=" + "))), data=iris, family=binomial()))
ans

Output

[[1]]

Call:  glm(formula = as.formula(paste0("Species ~ ", paste0(x, collapse = " + "))), 
    family = binomial(), data = iris)

Coefficients:
 (Intercept)  Sepal.Length   Sepal.Width  Petal.Length  
       71.80        -23.91        -13.51         34.95  

Degrees of Freedom: 149 Total (i.e. Null);  146 Residual
Null Deviance:      191 
Residual Deviance: 3.523e-09    AIC: 8

[[2]]

Call:  glm(formula = as.formula(paste0("Species ~ ", paste0(x, collapse = " + "))), 
    family = binomial(), data = iris)

Coefficients:
 (Intercept)  Sepal.Length   Sepal.Width   Petal.Width  
     -25.477         6.762       -19.057        59.292  

Degrees of Freedom: 149 Total (i.e. Null);  146 Residual
Null Deviance:      191 
Residual Deviance: 4.144e-09    AIC: 8

# etc
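
Note that the formulas above join the three features with " + ", so each model contains main effects only. A minimal sketch of how the same idea could be adapted to include interactions, assuming the features are joined with " * " instead (mtcars, the response mpg, and the four predictors chosen here are stand-ins for illustration, not part of the original answer):

```r
# Assumption: mtcars stands in for the real data and mpg is the response
vars <- c("am", "vs", "cyl", "gear")

# One character vector of three variable names per list element
triplets <- combn(vars, 3, simplify = FALSE)

# Joining with " * " gives main effects plus all two- and three-way
# interactions among the three variables in each triplet
fits <- lapply(triplets, function(x) {
  f <- as.formula(paste("mpg ~", paste(x, collapse = " * ")))
  lm(f, data = mtcars)
})

length(fits)        # one model per triplet
formula(fits[[1]])  # formula of the first fitted model
```

Each model here only sees three variables at a time, so this is a screening device rather than a single joint model.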

Upvotes: 1

G. Grothendieck

Reputation: 270248

Using the first 5 columns of the built-in anscombe data set:

lm(y1 ~ .^3, anscombe[1:5])

giving:

Call:
lm(formula = y1 ~ .^3, data = anscombe[1:5])

Coefficients:
(Intercept)           x1           x2           x3           x4        x1:x2  
   12.81992     -2.60371           NA           NA     -0.16258      0.36279  
      x1:x3        x1:x4        x2:x3        x2:x4        x3:x4     x1:x2:x3  
         NA           NA           NA           NA           NA     -0.01345  
   x1:x2:x4     x1:x3:x4     x2:x3:x4  
         NA           NA           NA  
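
In a formula, `.^3` expands to all main effects plus every two-way and three-way interaction, and nothing of higher order. One way to check this before fitting is to inspect the expanded terms; the sketch below uses a made-up data frame of five binary predictors (the names d, x1..x5 and y are hypothetical):

```r
# Hypothetical toy data: 5 binary predictors and a numeric response
set.seed(1)
d <- as.data.frame(matrix(rbinom(200 * 5, 1, 0.5), ncol = 5))
names(d) <- paste0("x", 1:5)
d$y <- rnorm(200)

# Expand the formula against the data to see which terms it generates
tt   <- terms(y ~ .^3, data = d)
labs <- attr(tt, "term.labels")  # e.g. "x1", "x1:x2", "x1:x2:x3"
ord  <- attr(tt, "order")        # interaction order of each term

table(ord)  # counts of main effects, two-way and three-way terms
max(ord)    # highest interaction order in the model
```

With 14 variables the same expansion produces 14 + 91 + 364 = 469 terms, so it is worth checking that the data have enough rows to support that many coefficients.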

Upvotes: 5
