Reputation: 3678
I know there is a shortcut in R to run an lm() regression on all columns of a data frame, like this:
reg<-lm(y~.,data=df)
Here df has explanatory variables x1, x2, ..., x5, so this is the same as writing
reg<-lm(y~x1+x2+x3+x4+x5,data=df)
But this doesn't include interaction terms like x1:x2, ... Is there a shortcut in R to run a regression on all columns of a data frame with the interactions?
I am looking for two shortcuts that have the same effect as
reg<-lm(y~x1*x2+x1*x3+x1*x4+x1*x5+x2*x3+...,data=df)
reg<-lm(y~x1*x2*x3*x4*x5,data=df) # this one includes all interactions among the 5 variables
Upvotes: 10
Views: 11827
Reputation: 181
The shortcut you are searching for is:
reg <- lm(y ~ (.)^2, data = df)
This creates a model with the main effects and all pairwise (two-way) interactions between the regressors.
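To check what the formula expands to without reading a regression summary, one option is to inspect the term labels with terms(). The sketch below uses a made-up data frame (the names y and x1 to x5 follow the question; the values are random and purely illustrative) and compares the dot shortcut with the spelled-out formula.

```r
# Hypothetical toy data: a response y and five predictors x1..x5
set.seed(1)
df <- data.frame(y  = rnorm(50), x1 = rnorm(50), x2 = rnorm(50),
                 x3 = rnorm(50), x4 = rnorm(50), x5 = rnorm(50))

# Expand the dot against the data frame and collect the term labels
labels_dot <- attr(terms(y ~ (.)^2, data = df), "term.labels")

# The same model with the predictors written out by hand
labels_explicit <- attr(terms(y ~ (x1 + x2 + x3 + x4 + x5)^2), "term.labels")

print(labels_dot)

# 5 main effects plus choose(5, 2) = 10 pairwise interactions
length(labels_dot) == 5 + choose(5, 2)
setequal(labels_dot, labels_explicit)
```

Both comparisons should come back TRUE, which suggests the shortcut and the hand-written formula describe the same set of terms.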
Upvotes: 18
Reputation: 37879
For both cases you can use the ^ operator. See the example below.
In your first case you only need the pairwise (2-way) interactions, so you could do:
# Example df: four columns of 100 uniform random draws each
df <- data.frame(a=runif(100), b=runif(100), c=runif(100), d=runif(100))
> lm(a ~ (b+c+d)^2, data=df)
Call:
lm(formula = a ~ (b + c + d)^2, data = df)
Coefficients:
(Intercept)            b            c            d          b:c          b:d          c:d
    0.53873      0.23531      0.07813     -0.14763     -0.43130      0.11084      0.13181
As you can see, the above produces the pairwise interactions.
Now, to include all the interactions, you can do:
> lm(a ~ (b+c+d)^5 , data=df)
Call:
lm(formula = a ~ (b + c + d)^5, data = df)
Coefficients:
(Intercept)            b            c            d          b:c          b:d          c:d        b:c:d
    0.54059      0.23123      0.07455     -0.15150     -0.42340      0.11926      0.14017     -0.01803
In this case the exponent just needs to be at least the number of explanatory variables (here I use 5, but any value of 3 or more gives the same model). As you can see, all the interactions are produced, up to the three-way term b:c:d.
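The same idea can be combined with the dot so the columns do not have to be typed out. The sketch below is only an illustration on freshly simulated data (the seed is arbitrary, so the coefficients will not match the output shown above); it compares the dot form, the explicit * form from the question, and an exponent larger than the number of predictors.

```r
# Simulated example data with the same column names as above
set.seed(42)
df <- data.frame(a = runif(100), b = runif(100), c = runif(100), d = runif(100))

fit_dot  <- lm(a ~ .^3, data = df)            # dot expands to b + c + d
fit_star <- lm(a ~ b * c * d, data = df)      # full crossing written out
fit_high <- lm(a ~ (b + c + d)^5, data = df)  # exponent above 3 adds nothing new

# Intercept, 3 main effects, 3 two-way terms and 1 three-way term
names(coef(fit_dot))

# All three fits should give the same coefficients
all.equal(coef(fit_dot), coef(fit_star))
all.equal(coef(fit_dot), coef(fit_high))
```

With many columns the full crossing grows quickly (p predictors give 2^p - 1 terms), so the pairwise version is often the more practical of the two.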
Upvotes: 11