Reputation: 1848
I have a data frame (df) with numeric variables. I want to run a regression on every pairwise combination of columns (which variable is dependent and which is independent doesn't matter to me).
So I wrote the code below, which works fine:
# gtools is required for the combinations() function
library(gtools)
# generate data: a data frame with 3 columns and 30 rows
df <- as.data.frame(replicate(3, rnorm(30)))
# extract all two-column combinations of column names
comb <- combinations(n = ncol(df), 2, colnames(df))
listc <- list()
for (i in 1:nrow(comb)) {
  vars <- df[comb[i, ]]
  model.lm <- lm(vars[, 1] ~ vars[, 2], data = df)
  listc[[i]] <- coefficients(model.lm)
}
I want this code to run faster. I tried the foreach library to enable parallel processing, but I couldn't work out how to apply it to the code above.
How can I apply foreach to this code? I would be very grateful for any help. Thanks a lot.
Upvotes: 0
Views: 961
Reputation: 1465
# Load required libraries
library(gtools)      # combinations() is called on the master process below
library(parallel)
library(foreach)
library(doParallel)
# Register a parallel cluster with all available cores minus 1
cl <- makeCluster(detectCores() - 1)
registerDoParallel(cl)
# Extract all two-column combinations of column names
comb <- combinations(n = ncol(df), 2, colnames(df))
listc <- foreach(i = 1:nrow(comb), .packages = "gtools", .combine = 'c') %dopar% {
  # Fit the model for the i-th pair of columns
  vars <- df[comb[i, ]]
  model.lm <- lm(vars[, 1] ~ vars[, 2], data = df)
  # Get coefficients, wrapped in a list so that 'c' builds a list of vectors
  coef_i <- list(coefficients(model.lm))
  coef_i
}
# Shut the workers down once the loop has finished
stopCluster(cl)
The returned listc is a list with the coefficients from each iteration, i.e. one element per row of comb.
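If it helps to see which pair each element of listc belongs to, here is a minimal sketch of one way to label the results. It is not part of the code above: it assumes base R only (combn() stands in for gtools::combinations()), runs sequentially with lapply(), and fit_pair is a hypothetical helper name. It builds the formula with reformulate() so the coefficient names show the actual column instead of vars[, 2].

```r
# Sketch only: label each coefficient vector with the column pair it came from.
# Assumption: base R combn() replaces gtools::combinations(); fit_pair is a
# hypothetical helper, not something from the original code.
set.seed(1)
df <- as.data.frame(replicate(3, rnorm(30)))

# Each row of comb holds one pair of column names
comb <- t(combn(colnames(df), 2))

fit_pair <- function(i) {
  # reformulate() builds a formula such as V1 ~ V2 from the column names
  f <- reformulate(comb[i, 2], response = comb[i, 1])
  coefficients(lm(f, data = df))
}

# Sequential version; one coefficient vector per row of comb
listc <- lapply(seq_len(nrow(comb)), fit_pair)

# Name each element after its pair, e.g. "V1~V2"
names(listc) <- paste(comb[, 1], comb[, 2], sep = "~")
```

The body of fit_pair(i) could probably be dropped into the foreach loop unchanged, and the names() line applied to the list it returns, but I have only checked the sequential version.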
Upvotes: 1