Select and run regression on all unique pairs of variables

Question

Say I have the following data:

set.seed(1)
n=1000
x1=rnorm(n,0,1)
x2=rnorm(n,0,1)
x3=rnorm(n,0,1)
d=cbind(x1,x2,x3)

How can I run a simple (single-predictor) regression for every pair of variables and then extract the slope estimate and its standard error for each pair?

This means I would need to run:

summary(lm(x1~x2,data=d)) # slope .0018; se .033
summary(lm(x1~x3,data=d)) # slope -.094; se .033
summary(lm(x2~x1,data=d)) # slope .002; se .03
...etc

Ultimately, I'd like output that looks like:

#slopes
      x1   x2     x3
  x1   NA .001  -.094
  x2  ...etc
  x3

 #se
      x1   x2     x3
  x1   NA .033   .033
  x2  ...etc
  x3

Molx · Accepted Answer

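I borrowed part of @DMC's code but used combn instead of nested loops. It may be more efficient when the number of variables is large, because it fits each unordered pair only once. The result is mirrored across the diagonal, so entry [j,i] is a copy of the [i,j] fit rather than a separate regression in the reverse direction.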

nvars <- ncol(d)
slopes <- matrix(NA, nrow = nvars, ncol = nvars)
se <- matrix(NA, nrow = nvars, ncol = nvars)

combs <- combn(nvars, 2) # each column is one unordered pair of column indices

for (i in seq_len(ncol(combs))) {
  r <- combs[1, i] # index of the response column
  p <- combs[2, i] # index of the predictor column
  coefs <- coef(summary(lm(d[, r] ~ d[, p])))
  # Row 2 of the coefficient table is the slope; store the same fit at [r, p] and [p, r]
  slopes[r, p] <- slopes[p, r] <- coefs[2, "Estimate"]
  se[r, p] <- se[p, r] <- coefs[2, "Std. Error"]
}
colnames(se) <- rownames(se) <- colnames(slopes) <- rownames(slopes) <- colnames(d)
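
The question lists x1~x2 and x2~x1 as separate fits (slopes .0018 and .002), and in general the two directions give different slopes and standard errors. If both directions are wanted, one possible variation is to loop over ordered pairs instead of using combn. This is only a sketch under that assumption, not part of the original answer; the loop variables y and x are illustrative names, and rows are taken to be responses and columns predictors, as in the desired output.

```r
# Same simulated data as in the question
set.seed(1)
n <- 1000
x1 <- rnorm(n, 0, 1)
x2 <- rnorm(n, 0, 1)
x3 <- rnorm(n, 0, 1)
d <- cbind(x1, x2, x3)

vars <- colnames(d)
slopes <- matrix(NA_real_, nrow = length(vars), ncol = length(vars),
                 dimnames = list(vars, vars))
se <- slopes

# Rows are responses, columns are predictors; the diagonal stays NA
for (y in vars) {
  for (x in setdiff(vars, y)) {
    coefs <- coef(summary(lm(d[, y] ~ d[, x])))
    slopes[y, x] <- coefs[2, "Estimate"]   # row 2 is the slope term
    se[y, x] <- coefs[2, "Std. Error"]
  }
}

round(slopes, 3)
round(se, 3)
```

This fits nvars * (nvars - 1) models rather than choose(nvars, 2), so it costs twice as many fits as the combn version in exchange for direction-specific entries.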
