Parseltongue

Reputation: 11677

Add all possible two-way interactions between two sets of variables - R

Let's say I have the following specification:

glm(death ~ age + black + hisp + other + rich + middle, family = binomial("probit"), data=data)

Is there a straightforward way to add all two-way interactions between the "ethnicity group" (black, hisp, and other) and the "income group" (rich, middle)? The interactions would be black*rich, black*middle, hisp*rich, and so on.

Upvotes: 2

Views: 1582

Answers (2)

Parfait

Reputation: 107652

Consider pasting all combinations inside the formula:

vars1 <-  c('black', 'hisp', 'other')
vars2 <-  c('rich', 'middle')
interactions <- outer(vars1, vars2, function(x,y){paste0(x,'*',y)})
intjoin <- paste(interactions, collapse=" + ")
#[1] "black*rich + hisp*rich + other*rich + black*middle + hisp*middle + other*middle"

# Convert the pasted string to a formula explicitly before fitting
model <- glm(as.formula(paste0('death ~ age + black + hisp + other + rich + middle + ', intjoin)),
             family = binomial("probit"), data=data)
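A variation on the same idea, as a minimal sketch: because `*` also expands to the main effects (which are already listed), pasting with `:` adds only the interaction terms, and base R's `reformulate()` can assemble the formula without manual string concatenation. The names `pairs` and `f` are only illustrative, and no data is needed to build or inspect the formula.

```r
vars1 <- c("black", "hisp", "other")
vars2 <- c("rich", "middle")

# ':' denotes the interaction term alone, so main effects are not repeated
pairs <- as.vector(outer(vars1, vars2, paste, sep = ":"))

# Build the formula from term labels instead of pasting one long string
f <- reformulate(c("age", vars1, vars2, pairs), response = "death")
print(f)
```

The resulting `f` could then be passed to `glm(f, family = binomial("probit"), data = data)` in place of the pasted string.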

Upvotes: 1

IRTFM

Reputation: 263382

The formula interface lets you do that easily with the ^ operator: you can construct all the two-way interactions between two factor variables with (ethnicity + incgrp)^2, but that only applies if you use R's factor conventions. It appears you are circumventing the proper use of formulas and factors by creating SAS-style dummy variables instead. For your situation, you might try:

glm(death ~ age + (black + hisp + other) * (rich + middle), family = binomial("probit"), data=data)

Inside a formula, both ^ and * construct interactions; they lose their conventional arithmetic meaning. See ?formula.
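To see what these operators expand to, one option is to inspect the term labels with `terms()`, which works on the formula alone and needs no data. This is only a sketch for checking the expansion; `f_dummy`, `f_factor`, `labels_dummy`, and `labels_factor` are illustrative names, and `ethnicity` and `incgrp` stand for the hypothetical factor versions of the dummy variables.

```r
# Dummy-variable version: '*' crosses every term on the left with every term on the right
f_dummy <- death ~ age + (black + hisp + other) * (rich + middle)
labels_dummy <- attr(terms(f_dummy), "term.labels")
print(labels_dummy)

# Factor version: '^2' gives all main effects plus all two-way interactions
f_factor <- death ~ age + (ethnicity + incgrp)^2
labels_factor <- attr(terms(f_factor), "term.labels")
print(labels_factor)
```

The first formula yields the six main effects plus the six ethnicity-by-income interactions, with none of the within-group interactions such as black:hisp.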

Upvotes: 3
