N. McA.

Reputation: 4946

Generalising the creation of a formula in R

I'm trying to create a formula in R, of the form

Output~Var1+Var2+Var3

for use in a model. As far as I can tell, you give the name of the variable you want to predict, a tilde, and then the names of the variables you want to use as predictors; a later argument supplies the data frame containing observations of those variables. The data frame I'm using, however, has quite a few variables in it, and I don't want to type them all out. The variable names also change fairly often, so keeping my code up to date would be a chore. In essence, I want to know how to write

Output~(All the variables that aren't the output)

I also need to exclude some other variables. Sorry to make it so clear that I don't know what's going on; ?formula didn't help much, and this isn't like any other programming or R structure I've seen before.

Thanks for any help,

N

Upvotes: 2

Views: 221

Answers (2)

N. McA.

Reputation: 4946

Ah, I found a much better solution: the function

reformulate(termlabels = colnames(InputTable), response = 'Prediction')

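creates a formula from the strings you provide. Manipulate colnames as you like to choose dynamically which variables are used in the model. If the response column is itself part of InputTable, remove its name from termlabels first (for example with setdiff()), so that it does not also appear on the right-hand side.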
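As a minimal sketch of that idea, assuming a hypothetical data frame `InputTable` with a response column `Prediction` and an `ID` column that should be left out of the model (the data and column names are made up for illustration):

```r
# Hypothetical data; the values and column names are assumptions.
InputTable <- data.frame(
  Prediction = c(1.2, 2.3, 2.9, 4.1, 5.2),
  Var1       = c(1, 2, 3, 4, 5),
  Var2       = c(2, 1, 4, 3, 6),
  ID         = 1:5
)

# Drop the response itself and anything else that should not be a predictor.
exclude    <- c("Prediction", "ID")
predictors <- setdiff(colnames(InputTable), exclude)

# Build the formula from the remaining column names.
fmla <- reformulate(termlabels = predictors, response = "Prediction")
print(fmla)

# The formula can then be passed to a modelling function as usual.
fit <- lm(fmla, data = InputTable)
```

Because the predictor list is computed from `colnames()`, renaming or adding columns only requires updating the `exclude` vector, not the formula.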

Upvotes: 5

N. McA.

Reputation: 4946

Actually, the ?formula documentation provides one possible answer. It is, however, extremely 'hacky', and one of the least pleasant ways I can imagine accomplishing this:

## Create a formula for a model with a large number of variables:
xnam <- paste0("x", 1:25)
(fmla <- as.formula(paste("y ~ ", paste(xnam, collapse= "+"))))

i.e., you just paste together a string and convert it to a formula with as.formula().
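The same string-pasting approach can be driven by a data frame's column names rather than a generated sequence. This is only a sketch under assumed names: `dat`, `y`, `x1`, `x2` and `notes` are hypothetical and not from the question.

```r
# Hypothetical data frame; the values and column names are assumptions.
dat <- data.frame(
  y     = c(3, 5, 4, 8, 9, 12),
  x1    = 1:6,
  x2    = c(2, 1, 3, 2, 5, 4),
  notes = letters[1:6]
)

# Keep every column except the response and the ones to exclude.
predictors <- setdiff(names(dat), c("y", "notes"))

# Paste the right-hand side together and convert the string to a formula.
fmla <- as.formula(paste("y ~", paste(predictors, collapse = " + ")))

fit <- lm(fmla, data = dat)
```

One caveat with pasted strings: column names that are not syntactically valid R names (for example, names containing spaces) would need to be wrapped in backticks before pasting.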

Upvotes: 1
