Reputation: 11
How does one run a multivariate linear regression when there are many dependent variables, 222 in my case? I want to regress values I have for 222 different companies on a few regressors.
I know I can do, e.g.:
y <- cbind(y1, y2, y3... yn)
fit <- lm(y ~ X1 + X2 + ... Xn)
But there has to be a cleverer way to cbind my columns than writing cbind(y1, y2, y3, ..., y222) by hand, right?
I have tried cbind(vol[, 2:223]), but passing that as y to lm() only results in:
Error in model.frame.default(formula = y ~ RMF + SMB + HML, drop.unused.levels = TRUE) :
  invalid type (list) for variable 'y'
I am not very experienced with R, so I appreciate all the help I can get for my thesis! Please bear with me.
Upvotes: 0
Views: 2315
Reputation: 269824
Below we use the built-in anscombe data frame as an example.
1) The key part is to use a matrix, not a data frame, for the left-hand side of the formula. In the example below we define a matrix y of the dependent variables and then use that with lm:
y <- as.matrix(anscombe[5:8])
lm(y ~ x1 + x2 + x3 + x4, anscombe)
1a) or, if there are many independent variables too:
lm(y ~ ., anscombe[1:4])
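Applied to the question, the same idea might look like the sketch below. The vol data frame here is a made-up stand-in (a date column, three company columns, and the factors RMF, SMB and HML in the same data frame); the real layout is an assumption, and with the actual data the column range would be 2:223.

```r
# Hypothetical stand-in for the asker's data: a date column,
# three company columns and three factor columns (all simulated).
set.seed(1)
n <- 60
vol <- data.frame(
  date     = seq_len(n),
  company1 = rnorm(n),
  company2 = rnorm(n),
  company3 = rnorm(n),
  RMF      = rnorm(n),
  SMB      = rnorm(n),
  HML      = rnorm(n)
)

# vol[, 2:4] is a data frame (a list), which lm() rejects as a response;
# as.matrix() turns it into a numeric matrix with one column per company.
y <- as.matrix(vol[, 2:4])   # would be vol[, 2:223] with 222 companies

# One call fits a separate regression for every column of y.
fit <- lm(y ~ RMF + SMB + HML, data = vol)

coef(fit)   # matrix with one column of coefficients per company
```

The fitted object has class "mlm", and summary(fit) returns one regression summary per response column.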
2) One could alternatively use lm.fit. Note that it does not automatically add an intercept, so we add one:
m <- as.matrix(anscombe)
lm.fit(cbind(Intercept = 1, m[, 1:4]), m[, 5:8])
lm.fit returns a list rather than an lm object, but some methods such as coef and resid (but not summary) work with it anyway.
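As a rough check of that claim, the sketch below compares lm.fit with lm on the same data. It uses only x1 and x4 as regressors, which is a choice made for this illustration: x1, x2 and x3 hold identical values in anscombe, so a fit on all four would report NA for the redundant columns.

```r
m <- as.matrix(anscombe)

# Design matrix with an explicit intercept column; x1 and x4 only,
# since x1, x2 and x3 are identical in anscombe.
X <- cbind(Intercept = 1, m[, c("x1", "x4")])
Y <- m[, 5:8]                                  # y1 ... y4

fit_raw <- lm.fit(X, Y)                        # plain list, no class
fit_lm  <- lm(Y ~ x1 + x4, data = anscombe)    # "mlm" object

# Compare the coefficient matrices, ignoring dimnames
# (the intercept row is labelled differently in the two fits).
all.equal(unname(coef(fit_raw)), unname(coef(fit_lm)))

dim(resid(fit_raw))                            # one residual column per response
```

If standard errors or p-values are needed, lm is the more convenient choice because summary works on its result.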
Upvotes: 1
Reputation: 21729
Assuming all 222 vectors are of the same length and your global environment contains no other objects, you can try:
Method 1:
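# collect every object in the global environment into a named list
vec_list <- as.list(.GlobalEnv)
# as.list() on an environment returns elements in arbitrary order, so sort by name
vec_list <- vec_list[order(names(vec_list))]
# bind the list elements column-wise into a matrix (base R, no purrr needed)
y <- do.call(cbind, vec_list)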
Method 2:
# list the names of all objects in your current environment
vec_names <- ls()
# fetch the objects by name and bind them column-wise into a matrix
y <- do.call(cbind, mget(vec_names))
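If the environment holds other objects as well, one possible variation is to select the vectors by a name pattern. The sketch below assumes the vectors are named y1, y2, ... (a hypothetical naming scheme) and keeps them in a scratch environment so the example is self-contained; with objects in the global environment the envir arguments could be dropped.

```r
# Hypothetical setup: three response vectors named y1, y2, y3 plus an
# unrelated object, all stored in a scratch environment.
env <- new.env()
set.seed(1)
for (i in 1:3) assign(paste0("y", i), rnorm(10), envir = env)
assign("other_stuff", letters, envir = env)

# Keep only names of the form y<number>, then order them numerically
# (ls() sorts alphabetically, so y10 would otherwise come before y2).
nms <- ls(env, pattern = "^y[0-9]+$")
nms <- nms[order(as.integer(sub("^y", "", nms)))]

# mget() fetches the objects by name; do.call(cbind, ...) binds them
# into a matrix whose column names are the object names.
y <- do.call(cbind, mget(nms, envir = env))
dim(y)
```

The resulting matrix can be used directly on the left-hand side of an lm formula.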
Upvotes: 0