Reputation: 145
I'm fitting several multi-variable linear models using lm().
Basically, matrix1 holds the dependent variables (y) and matrix2 the independent ones (x):
model.1<-lm(matrix1[, 1] ~ matrix2)
where matrix2 has a variable number of columns depending on the specific combination of parameters I want in the regression. There are no zero-value columns in matrix2.
This statement works fine for a linear model with no interaction between the independent variables (IV), i.e. a model like a0 + a1*x1 + a2*x2 ..., but if I want to introduce interactions between the IVs, the manual says to use the * operator between the variables (model.1 <- lm(matrix1[, 1] ~ x1 * x2 * x3)). How can I apply this when the IVs are in a matrix?
Upvotes: 0
Views: 1568
Reputation: 269852
1) SO questions are supposed to provide the test data reproducibly, but here we have done it for you using the builtin data.frame anscombe. After defining the test data, we define a data frame containing the columns we want and the appropriate formula. Finally we call lm:
# test data
matrix1 <- as.matrix(anscombe[5:8])
matrix2 <- as.matrix(anscombe[1:4])
DF <- data.frame(matrix1[, 1, drop = FALSE], matrix2) # cols are y1, x1, x2, x3, x4
fo <- sprintf("%s ~ (.)^%d", colnames(matrix1)[1], ncol(matrix2)) # "y1 ~ (.)^4"
lm(fo, DF)
giving:
Call:
lm(formula = fo, data = DF)
Coefficients:
(Intercept)           x1           x2           x3           x4        x1:x2
    12.8199      -2.6037           NA           NA      -0.1626       0.3628
      x1:x3        x1:x4        x2:x3        x2:x4        x3:x4     x1:x2:x3
         NA           NA           NA           NA           NA      -0.0134
   x1:x2:x4     x1:x3:x4     x2:x3:x4  x1:x2:x3:x4
         NA           NA           NA           NA
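To check what a formula such as y1 ~ (.)^4 expands to without fitting anything, the term labels can be inspected directly. The following is only a minimal sketch reusing anscombe; the names DF2, labs4 and labs2 are illustrative and not part of the answer. It also suggests that lowering the exponent to 2 keeps just the main effects and pairwise interactions.

```r
# Rebuild a data frame like the one in the answer (y1 plus x1..x4 from anscombe)
DF2 <- anscombe[c("y1", "x1", "x2", "x3", "x4")]

# Expand the formulas against the data frame without fitting a model;
# "." stands for every column of DF2 that is not on the left-hand side
labs4 <- attr(terms(y1 ~ (.)^4, data = DF2), "term.labels")
labs2 <- attr(terms(y1 ~ (.)^2, data = DF2), "term.labels")

length(labs4)  # 15 terms: 4 main effects plus 6 + 4 + 1 interactions
length(labs2)  # 10 terms: 4 main effects plus 6 pairwise interactions

# x1, x2 and x3 are identical columns in anscombe, which is why lm
# reports NA for the terms it cannot estimate separately
identical(anscombe$x1, anscombe$x2)
identical(anscombe$x2, anscombe$x3)
```

With real data whose columns are not duplicates of each other, fewer coefficients should come back as NA.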
2) A variation of this, which gives a slightly nicer result in the Call: part of the lm output, is the following. We use DF from above. do.call will pass the contents of the fo variable rather than its name, so that we see the formula in the Call: part of the output. On the other hand, quote(DF) is used to force the name DF to display rather than the contents of the data.frame.
lhs <- colnames(matrix1)[1]
rhs <- paste(colnames(matrix2), collapse = "*")
fo <- paste(lhs, rhs, sep = " ~ ") # "y1 ~ x1*x2*x3*x4"
do.call("lm", list(fo, quote(DF)))
giving:
Call:
lm(formula = "y1 ~ x1*x2*x3*x4", data = DF)
Coefficients:
(Intercept)           x1           x2           x3           x4        x1:x2
    12.8199      -2.6037           NA           NA      -0.1626       0.3628
      x1:x3        x2:x3        x1:x4        x2:x4        x3:x4     x1:x2:x3
         NA           NA           NA           NA           NA      -0.0134
   x1:x2:x4     x1:x3:x4     x2:x3:x4  x1:x2:x3:x4
         NA           NA           NA           NA
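As a possible alternative to pasting the whole formula together as a string, the same model can probably be built as a formula object with reformulate() from the stats package. This is only a sketch under the same test data as above; fo2 and fit2 are illustrative names, not part of the answer.

```r
# Same test data as in the answer
matrix1 <- as.matrix(anscombe[5:8])
matrix2 <- as.matrix(anscombe[1:4])
DF <- data.frame(matrix1[, 1, drop = FALSE], matrix2)  # cols are y1, x1, x2, x3, x4

# Build y1 ~ x1 * x2 * x3 * x4 as a formula object rather than a string
rhs <- paste(colnames(matrix2), collapse = "*")
fo2 <- reformulate(rhs, response = colnames(matrix1)[1])

fit2 <- lm(fo2, data = DF)
length(coef(fit2))  # 16: intercept plus 15 terms (NA where a term is aliased)
```

Because fo2 is a real formula, functions such as update() or all.vars() can be applied to it before fitting.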
Upvotes: 1