Reputation: 13
I am trying to run the same linear regression on every pair of variables in a data.table. I have three dependent variables y1, y2, y3 and 10 explanatory variables x1 to x10. Some observations are missing in each series.
In the example below, I would like to repeat the regression command (the d[...] call) for each pair of y and x variables.
library(data.table)
# Example data: 3 countries with 10 observations each
d <- data.table(country = rep(c('a','b','c'), c(10,10,10)), y1 = rnorm(30), y2 = rnorm(30), x1 = runif(30), x2 = runif(30))
# Slope and p-value of y1 ~ x1 within each country
d[!is.na(y1) & !is.na(x1), .(beta1 = summary(lm(y1 ~ x1))$coefficients[2, 1], p1 = summary(lm(y1 ~ x1))$coefficients[2, 4]), by = country]
Upvotes: 1
Views: 318
Reputation: 25225
Here is a more base-R approach. You can generate all combinations of x's and y's using data.table::CJ or expand.grid, then go through each combination to perform your linear regression.
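# All (x, y) variable-name pairs; the columns are named explicitly
combi <- CJ(x = grep("^x", names(d), value = TRUE), y = grep("^y", names(d), value = TRUE))
# Fit one model per pair on the full data set (not split by country)
lmRes <- apply(combi, 1, function(v) {
  fml <- as.formula(paste(v["y"], "~", v["x"]))
  lm(fml, data = d)
})
lmRes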
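The loop above fits each model on the whole data set. If you also want the per-country slope and p-value from the question, one possible extension is sketched below. The helper name fit_pair and the result column names (y, x, beta, p) are my own choices, and the data is regenerated so that the snippet runs on its own.

```r
library(data.table)

set.seed(1)
d <- data.table(country = rep(c("a", "b", "c"), each = 10),
                y1 = rnorm(30), y2 = rnorm(30),
                x1 = runif(30), x2 = runif(30))

# All (x, y) name pairs; expand.grid is the base R alternative to CJ
combi <- expand.grid(x = grep("^x", names(d), value = TRUE),
                     y = grep("^y", names(d), value = TRUE),
                     stringsAsFactors = FALSE)

# fit_pair is a hypothetical helper: slope and p-value of yvar ~ xvar per country.
# lm() drops rows with missing values by default (na.action = na.omit).
fit_pair <- function(xvar, yvar) {
  fml <- reformulate(xvar, response = yvar)
  d[, {
    cf <- summary(lm(fml, data = .SD))$coefficients
    .(y = yvar, x = xvar, beta = cf[2, 1], p = cf[2, 4])
  }, by = country]
}

# One small table per pair, stacked into a single result
res <- rbindlist(Map(fit_pair, combi$x, combi$y))
res
```

The result has one row per country and variable pair, so it can be filtered or reshaped like any other data.table.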
Short of first generating a large data set from d that contains every combination of x's and y's and then joining it with the combinations, there is probably no simpler way to solve this problem by joining tables.
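For completeness, here is a rough sketch of what that larger data set could look like. It is only one possible reading of the idea: the id column and the names ylong, xlong, long, yvar, xvar, yval and xval are assumptions of mine, and the table grows with the number of x's times the number of y's.

```r
library(data.table)

set.seed(1)
d <- data.table(country = rep(c("a", "b", "c"), each = 10),
                y1 = rnorm(30), y2 = rnorm(30),
                x1 = runif(30), x2 = runif(30))

# Row id so that x and y values from the same observation can be paired again
d[, id := .I]

# Long tables: one row per (observation, variable)
ylong <- melt(d, id.vars = c("id", "country"),
              measure.vars = grep("^y", names(d), value = TRUE),
              variable.name = "yvar", value.name = "yval")
xlong <- melt(d, id.vars = c("id", "country"),
              measure.vars = grep("^x", names(d), value = TRUE),
              variable.name = "xvar", value.name = "xval")

# Join on the observation: every y is paired with every x of the same row
long <- merge(ylong, xlong, by = c("id", "country"), allow.cartesian = TRUE)

# One regression per country and variable pair, skipping incomplete rows
res <- long[!is.na(yval) & !is.na(xval), {
  cf <- summary(lm(yval ~ xval))$coefficients
  .(beta = cf[2, 1], p = cf[2, 4])
}, by = .(country, yvar, xvar)]
res
```

With 3 y's and 10 x's the long table holds 30 rows per original observation, which is why looping over the combinations is usually the lighter option.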
Upvotes: 1