Reputation: 11
I am trying to create variable names from lists in R, but am struggling!
What I would ultimately like to do is to use previously created lists to create a formula for a multiple linear regression, whereby each value within the list will identify one of the explanatory variables of the regression formula.
I am starting with x lists of varying lengths (GoodModels_LMi, where i goes from 1 to x) and would like to use each list to create a separate formula.
for (i in 1:x){
lm(formula created from appropriate list)
i<-i+1
}
The lists correspond to variable numbers to be chosen from a data matrix (AllData). So for example if:
GoodModels_LM1<-c(2,4,8)
I would like my regression formula to be:
AllData[,1]~AllData[,2]+AllData[,4]+AllData[,8]
I have been trying to use as.formula() and paste() to achieve this, however, I am not sure how to create the second part of my formula.
as.formula(paste("AllData[,",i,"]~",paste(?????????)))
I know that this below is not right, but is as close as I have come:
paste("AllData[,",paste("GoodModels_LM",i,sep=""),"]",collapse="+")
I have also looked into assign(), but have not succeeded as the value argument was the same as the x argument.
Thanks very much for any help with this!
Olivia
Upvotes: 1
Views: 280
Reputation: 179438
Your formula should contain the column names, not the actual data. Here is a small demo using iris.
Imagine you want to run a regression using columns 2, 4, and 5 from iris. First, construct a formula using paste():
vars <- c(2, 4, 5)
frm <- paste("Sepal.Length ~", paste(names(iris)[vars], collapse = " + "))
frm
[1] "Sepal.Length ~ Sepal.Width + Petal.Width + Species"
So, the object frm is a string containing a formula that you can pass to lm():
lm(frm, iris)
Call:
lm(formula = frm, data = iris)
Coefficients:
(Intercept) Sepal.Width Petal.Width
2.5211 0.6982 0.3716
Speciesversicolor Speciesvirginica
0.9881 1.2376
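To connect this back to the question, here is a minimal sketch of the same idea applied in a loop over several sets of column numbers. It assumes that AllData has column names and that column 1 is the response. The simulated matrix, the V1..V8 column names, and the GoodModels list are hypothetical stand-ins, not the asker's real data. It uses reformulate() from the stats package as an alternative to pasting the formula string by hand.

```r
# Hypothetical stand-in for the asker's data: a numeric matrix with named columns.
set.seed(1)
AllData <- matrix(rnorm(200 * 8), ncol = 8,
                  dimnames = list(NULL, paste0("V", 1:8)))

# Hypothetical: the predictor column numbers for each model, kept in one list
# rather than in separate objects GoodModels_LM1, GoodModels_LM2, ...
GoodModels <- list(c(2, 4, 8), c(3, 5))

dat <- as.data.frame(AllData)   # lm() expects a data frame for `data`
response <- names(dat)[1]       # assumption: column 1 is the response

fits <- lapply(GoodModels, function(vars) {
  # reformulate() builds response ~ term1 + term2 + ... from character vectors
  frm <- reformulate(names(dat)[vars], response = response)
  lm(frm, data = dat)
})

formula(fits[[1]])   # inspect the formula used for the first model
```

If the index vectors already exist as separate objects, something like mget(paste0("GoodModels_LM", 1:x)) should collect them into a list that can be passed to lapply() in the same way.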
Upvotes: 2