MambaMentality

Reputation: 129

Regression with many variables, but not enough to justify using . and subtracting unnecessary variables

I'm trying to run a regression with roughly 20 explanatory variables from a dataset that has 50 variables. So it looks something like:

lm(data=data, formula = y ~ explanatory_1 + ... + explanatory_20)

Obviously this works fine, but we want the code to look a little cleaner. A lot of answers tell you to use .; however, I don't want to do that, because the dataset has about 20 or so variables that we don't use in the regression, i.e. we'd be subtracting as many variables as we include in the normal regression.

Is there a way to group the explanatory variables into a list, so it can instead look like

lm(data=data, formula = y ~ list)?

Furthermore, in some specifications we include a new covariate that also interacts with all the original covariates, so ideally we would have

lm(data=data, formula = y ~ list + new_var + new_var:list).

Can this be done? Thanks!

Upvotes: 0

Views: 60

Answers (1)

otheracct

Reputation: 66

You can put the explanatory variables in a character vector and use reformulate() to build the formula.

x_vars <- c('cyl', 'disp', 'hp')
# builds and fits mpg ~ cyl + disp + hp
lm(data = mtcars, formula = reformulate(x_vars, response = 'mpg'))
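The same idea should extend to the interaction case in the question: build the extra colon terms with paste() and pass them to reformulate() along with the main effects. A minimal sketch using mtcars, with wt standing in for the new covariate (the variable names here are placeholders, not from the question):

x_vars  <- c('cyl', 'disp', 'hp')
new_var <- 'wt'
# main effects for the originals and new_var, plus new_var interacted with each original
rhs <- c(x_vars, new_var, paste(x_vars, new_var, sep = ':'))
# builds and fits mpg ~ cyl + disp + hp + wt + cyl:wt + disp:wt + hp:wt
lm(data = mtcars, formula = reformulate(rhs, response = 'mpg'))

Equivalently, since * in a formula expands to main effects plus interactions, a single term label such as '(cyl + disp + hp) * wt' produces the same model.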

Upvotes: 3
