Reputation: 1750
I am using a package in R that fits a specific form of a regression model. However, unlike the base lm() function that permits the x and y to be separate objects, the function that I'm using requires them to be in the same dataframe.
My problem arises because I have many variables, each of which I want to use as the sole predictor of y in its own regression. I have a dataframe with 10 predictor variables (x1, x2, ..., x10) and one criterion variable (y), 11 columns in total. I could use a for loop to run ten separate regressions, but I want to avoid it and use the apply function instead. However, if I call apply on my dataframe, in the last step it will regress y on y itself, and I want to avoid this. Is there a function similar to apply that I could run while specifying that it should run only 10 times and not 11, or is there another workaround to this problem?
Upvotes: 0
Views: 576
Reputation: 13731
Here's a tidyverse solution:
library( tidyverse )
xx <- c("disp", "hp", "drat", "wt") # Names of predictor variables
y <- "mpg" # Name of response
str_c( y, xx, sep="~" ) %>%
map( as.formula ) %>% # Optional (see below)
map( lm, data = mtcars )
str_c simply builds up formulas as strings (e.g., "mpg~disp"). While lm accepts strings directly, your particular regression model might not. If it requires an actual formula, you can convert strings to formulas using as.formula (Thanks for the suggestion, @J.Doe!). Other than that, simply replace lm with your particular model and mtcars with your data frame.
Here's the same solution using base R without any additional packages:
strs <- paste( y, xx, sep="~" )
strs <- lapply( strs, as.formula ) # Optional
lapply( strs, lm, data=mtcars )
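When the data frame holds only the predictors and the response, as in the question, the predictor names do not have to be typed by hand. The following is a minimal sketch under assumptions: the data frame df and its columns are simulated here purely for illustration, and the response column is assumed to be named "y". It uses setdiff to drop the response from the column names and reformulate (from the stats package) to build formula objects directly.

```r
# Hypothetical data: three predictors plus a response column named "y"
set.seed(1)
df <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
df$y <- 2 * df$x1 + rnorm(20)

# Every column except the response is treated as a predictor
xx <- setdiff(names(df), "y")

# reformulate() builds y ~ x1, y ~ x2, ... as formula objects
fits <- lapply(xx, function(nm) lm(reformulate(nm, response = "y"), data = df))
names(fits) <- xx  # label each fit with its predictor
```

Because the loop runs over xx rather than over the columns of df, y is never regressed on itself. As above, lm would be swapped for the model function you actually need.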
Upvotes: 2
Reputation: 270248
Using the builtin anscombe data frame having columns x1, x2, x3, x4, y1, y2, y3, y4, suppose we want to regress y1 on each of x1, x2, x3, x4 separately.
First create a character vector of the names of the independent variables, xnames, and then use lapply to run the function run_lm over it. That function pastes together the required formula and runs lm, returning an "lm" class object. L, the result, is a list of such objects, one for each regression.
No packages are used.
xnames <- names(anscombe)[1:4]
run_lm <- function(nm) lm(paste("y1 ~", nm), anscombe)
L <- lapply(xnames, run_lm)
Alternately, this shorter version of run_lm would also work with the above lapply, but the Call: output line is not as nice:
run_lm <- function(nm) lm(anscombe[c("y1", nm)])
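With either version, the result is an ordinary list, so per-model summaries can be collected with sapply. A small sketch, repeating the first definition of run_lm so that it runs on its own; the names slopes and r2 are illustrative and not part of the code above:

```r
xnames <- names(anscombe)[1:4]
run_lm <- function(nm) lm(paste("y1 ~", nm), anscombe)
L <- lapply(xnames, run_lm)
names(L) <- xnames  # label each fit with its predictor

# Second coefficient of each fit is the slope on the single predictor
slopes <- sapply(L, function(fit) coef(fit)[[2]])

# R-squared of each regression
r2 <- sapply(L, function(fit) summary(fit)$r.squared)
```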
Upvotes: 0