User 6683331

Reputation: 710

Use string of independent variables within the lm function

I have a data frame with many variables. I want to fit a linear regression that explains the last one with the others. Since that was too much to type, I thought about creating a string with the independent variables, e.g. Var1 + Var2 +...+ VarK. I built it by pasting "+" after every column name except the last one with this code:

ExVar <- toString(paste(names(datos)[1:11], "+ ", collapse = ''))

I also had to remove the last "+":

ExVar <- substr(ExVar, 1, nchar(ExVar)-2)

So I copied and pasted the ExVar string within the lm() function and the result looked like this:

m1 <- lm(calidad ~ Var1 + Var2 +...+ VarK)

The question is: is there any way to use "ExVar" within the lm() function as a string holding the formula terms, not as a variable name, to get cleaner code?

For better understanding:

If I use this code:

m1 <- lm(calidad ~ ExVar)

lm() interprets ExVar as a single independent variable instead of expanding it into the terms it contains.

Upvotes: 1

Views: 2923

Answers (2)

Onyambu

Reputation: 79228

If you have a data frame and you want to explain the last column using all the rest, you can use the code below:

 lm(calidad ~ ., data = datos)

or you can use

 lm(rev(datos))  # only if the last column is your response variable

Either of the two calls above will give you the results you need.
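The second form relies on the fact that formula() applied to a data frame treats the first column as the response and the remaining columns as predictors, so reversing the columns puts the last one on the left-hand side. A minimal sketch to check this, assuming a made-up data frame (the name `toy` and its columns are hypothetical, not from the question):

```r
# Hypothetical toy data: two predictors, response in the last column
set.seed(1)
toy <- data.frame(x1 = rnorm(20), x2 = rnorm(20), y = rnorm(20))

# formula() on a data frame uses the first column as the response,
# so after rev() the last column (y) ends up on the left-hand side
f_rev <- formula(rev(toy))

# Fit once through the data-frame shortcut and once with the formula spelled out
fit_rev <- lm(rev(toy))
fit_explicit <- lm(y ~ x2 + x1, data = toy)
```

Printing f_rev shows which formula lm() ends up fitting, which is a quick way to confirm that the response is the column you intended.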

To do it your way:

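 # column 12 is assumed to be the response (calidad), so it is dropped from the predictors
 EXV <- as.formula(paste0("calidad ~ ", paste0(names(datos)[-12], collapse = " + ")))
 lm(EXV, data = datos)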

There is no need to do it this way, though: lm() builds the same model by itself when you use the first snippet above.
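If you do want to build the formula from column names, base R's reformulate() does the pasting for you. A minimal sketch, assuming 11 predictors followed by the response calidad (the simulated data and the names `predictors`, `f`, `m1` and `m2` are illustrative only):

```r
# Hypothetical data: 11 predictors followed by the response `calidad`
set.seed(1)
datos <- as.data.frame(matrix(rnorm(100 * 11), ncol = 11))
datos$calidad <- rnorm(100)

# Every column except the response becomes a term on the right-hand side
predictors <- setdiff(names(datos), "calidad")
f <- reformulate(predictors, response = "calidad")

# Fit with the generated formula and, for comparison, with dot notation
m1 <- lm(f, data = datos)
m2 <- lm(calidad ~ ., data = datos)
```

Selecting predictors by name with setdiff() avoids hard-coding the position of the response, so the code keeps working if columns are added or reordered.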

Upvotes: 2

Alex Brodersen

Reputation: 73

The following will all produce the same results. I am providing multiple methods because there are simpler ways of doing what you are asking (see examples 2 and 3) than writing the expression as a string.

First, I will generate some example data:

n <- 100
p <- 11
dat <- array(rnorm(n*p),c(n,p))

dat <- as.data.frame(dat)
colnames(dat) <- paste0("X",1:p)

If you really want to specify the model as a string, this example code will help:

# Join the predictor names with " + ", so there is no trailing "+" to trim
ExVar <- paste(names(dat)[2:11], collapse = " + ")
model1 <- paste("X1 ~", ExVar)
# as.formula() turns the string into a formula; eval(parse()) is not needed
fit1 <- lm(as.formula(model1), data = dat)

Otherwise, note that the 'dot' notation uses every other column of the data frame as a predictor.

fit2 <- lm(X1 ~ ., data = dat)

Or, you can select the predictors and outcome variables by column, if your data is structured as a matrix.

# keep dat as a data frame and work on a matrix copy
mat <- as.matrix(dat)
fit3 <- lm(mat[,1] ~ mat[,-1])

All three of these fit objects have the same coefficient estimates (fit3 only labels them differently):

fit1
fit2
fit3
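Rather than comparing the printed output by eye, the agreement can be checked numerically. A minimal, self-contained sketch that regenerates the example data with a fixed seed (the seed and the names `fit_string`, `fit_dot`, `fit_matrix`, `same_dot` and `same_matrix` are assumptions made for this illustration):

```r
# Recreate the example data with a fixed seed so the check is reproducible
set.seed(42)
n <- 100
p <- 11
dat <- as.data.frame(matrix(rnorm(n * p), n, p))
colnames(dat) <- paste0("X", 1:p)

# 1) formula built from a string
rhs <- paste(names(dat)[-1], collapse = " + ")
fit_string <- lm(as.formula(paste("X1 ~", rhs)), data = dat)

# 2) dot notation
fit_dot <- lm(X1 ~ ., data = dat)

# 3) matrix columns
mat <- as.matrix(dat)
fit_matrix <- lm(mat[, 1] ~ mat[, -1])

# The matrix fit names its coefficients differently, so compare values only
same_dot <- all.equal(unname(coef(fit_string)), unname(coef(fit_dot)))
same_matrix <- all.equal(unname(coef(fit_string)), unname(coef(fit_matrix)))
```

Both same_dot and same_matrix should come back TRUE if the three specifications describe the same model.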

Upvotes: 2
