Reputation: 11
I have a data set with 8 variables: one response and seven predictors. I need every model that contains all seven predictors plus one two-way interaction term. So, in my case there will be 7C2 = 21 models in total, each containing the 7 predictors and a single two-way interaction term.
I tried to produce the 21 models with a for loop, but the code fails at the lm() call inside the loop. In my problem, return is the response variable, stored in the fifth column of my data.
colnames(dt) = c("assets","turnover_ratio","SD","sharpe_ratio","return",
                 "expense_ratio","fund_dummy","risk_dummy")
vars = colnames(dt)[-5]
for (i in vars) {
  for (j in vars) {
    if (i != j) {
      factor = paste(i, j, sep = '*')}
    lm.fit <- lm(paste("return ~", factor), data = dt)
    print(summary(lm.fit))
  }}
The code produces the following error message:
Error in paste("return ~", factor) : cannot coerce type 'closure' to vector of type 'character'
This is my data set:
The output below is one of the desired models; 20 more are needed, one for each of the other possible two-way interaction terms. All 7 predictors should be present in each model. The only thing that should change is the two-way interaction term.
This is one of the 21 desired outputs:
Upvotes: 1
Views: 2945
Reputation: 76450
The following apply loop gets all pairwise interactions between the 7 variables. The 21 pairs are first obtained with combn.
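vars <- colnames(dt)[-5]   # the 7 predictors
resp <- colnames(dt)[5]    # the response, "return"
cmb <- combn(vars, 2)      # 2 x 21 matrix, one pair per column
lm_list <- apply(cmb, 2, function(regrs){
  # a*b expands to a + b + a:b, so the pair contributes its main effects too
  inter_regrs <- paste(regrs, collapse = "*")
  # the remaining 5 predictors enter as main effects only
  other_regrs <- setdiff(vars, regrs)
  all_regrs <- paste(other_regrs, collapse = "+")
  all_regrs <- paste(all_regrs, inter_regrs, sep = "+")
  fmla <- as.formula(paste(resp, all_regrs, sep = "~"))
  lm(fmla, data = dt)
})
lapply(lm_list, summary)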
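Since lm_list comes back unnamed, it can be hard to tell which summary belongs to which pair. As a minimal sketch (a variant, not the code from this answer), the same 21 models can be built with reformulate() and stored in a named list, after which the interaction row of each coefficient table can be pulled into one matrix. The names pairs and inter_tab are made up for illustration, and the data are the simulated values this answer uses, not the asker's real data.

```r
# Simulated stand-in data (assumption: the real data have the same column names)
set.seed(1234)
dt <- as.data.frame(replicate(8, rnorm(100)))
colnames(dt) <- c("assets", "turnover_ratio", "SD",
                  "sharpe_ratio", "return", "expense_ratio",
                  "fund_dummy", "risk_dummy")

vars <- colnames(dt)[-5]
resp <- colnames(dt)[5]

# List of the 21 unordered pairs, named like the interaction term, e.g. "assets:SD"
pairs <- combn(vars, 2, simplify = FALSE)
names(pairs) <- vapply(pairs, paste, character(1), collapse = ":")

# Each model: all 7 main effects plus exactly one interaction term
lm_list <- lapply(pairs, function(p) {
  fmla <- reformulate(c(vars, paste(p, collapse = ":")), response = resp)
  lm(fmla, data = dt)
})

# One row per model: estimate, std. error, t value and p-value of the interaction
inter_tab <- t(sapply(names(lm_list), function(nm) {
  coef(summary(lm_list[[nm]]))[nm, ]
}))

head(inter_tab)
```

With a named list, a single model can also be inspected directly, for example summary(lm_list[["assets:SD"]]).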
Data creation code.
set.seed(1234)
dt <- replicate(8, rnorm(100))
dt <- as.data.frame(dt)
colnames(dt) <- c("assets","turnover_ratio","SD",
"sharpe_ratio","return","expense_ratio",
"fund_dummy","risk_dummy")
Upvotes: 2
Reputation: 384
I think this should work and allow you to get rid of the loops:
lm.fit = lm(return ~ (.)^2, data=dt)
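Note that this formula fits one model containing all 21 two-way interactions at once, which is not the same as 21 separate models with one interaction each. A minimal sketch on simulated data (the data and the names full_fit, n_coef and n_inter are assumptions for illustration) makes the difference visible by counting coefficients:

```r
# Simulated stand-in data with the column names from the question
set.seed(1234)
dt <- as.data.frame(replicate(8, rnorm(100)))
colnames(dt) <- c("assets", "turnover_ratio", "SD",
                  "sharpe_ratio", "return", "expense_ratio",
                  "fund_dummy", "risk_dummy")

# (.)^2 expands to all main effects plus every pairwise interaction
full_fit <- lm(return ~ (.)^2, data = dt)

n_coef  <- length(coef(full_fit))                               # intercept + 7 + 21
n_inter <- sum(grepl(":", names(coef(full_fit)), fixed = TRUE))  # interaction terms only

n_coef
n_inter
```

Whether a single saturated model or 21 separate models is appropriate depends on the analysis; the question asks for the latter.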
Upvotes: 1
Reputation: 1123
Your problem is where the if statement ends: the closing brace comes right after the paste() call, so lm() runs even when i == j. This code should work:
colnames(dt) = c("assets","turnover_ratio","SD","sharpe_ratio","return",
                 "expense_ratio","fund_dummy","risk_dummy")
vars = colnames(dt)[-5]
for (i in vars) {
  for (j in vars) {
    if (i != j) {
      factor = paste(i, j, sep = '*')
      lm.fit <- lm(paste0("return ~", factor), data = dt)
      print(summary(lm.fit))
    }
  }
}
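The problem was that in the first iteration (where i == j) the variable factor was not defined, so paste() received the built-in function factor instead. That function is the 'closure' the error message refers to. Also, try not to name a variable factor, since factor is a function in R.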
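Both points can be checked without any data. The sketch below first reproduces the error by pasting the base function factor into a string, then runs the corrected loop with a differently named variable (inter_term and formulas are names chosen here for illustration) and only collects the formula strings instead of fitting models.

```r
# Pasting a function (a closure) into a string fails with the error from the question
err <- tryCatch(paste("return ~", base::factor),
                error = function(e) conditionMessage(e))
err

# Corrected loop: everything that needs the pair stays inside the if block,
# and the variable no longer shadows base::factor
vars <- c("assets", "turnover_ratio", "SD", "sharpe_ratio",
          "expense_ratio", "fund_dummy", "risk_dummy")
formulas <- character(0)
for (i in vars) {
  for (j in vars) {
    if (i != j) {
      inter_term <- paste(i, j, sep = "*")
      formulas <- c(formulas, paste("return ~", inter_term))
    }
  }
}

length(formulas)  # 7 * 6 = 42 ordered pairs
```

Because the condition i != j visits each pair in both orders, the loop produces 42 formulas rather than 21, and each one contains only the two interacting predictors. Getting the 21 models asked for in the question additionally requires iterating over unordered pairs and adding the remaining predictors to each formula.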
Upvotes: 1