Reputation: 588
I need to run lengthy, computationally intensive backtests in parallel. Each backtest takes about 8 hours on average, and I need to run 30 of them. They all call the same function with different inputs. So far I have found the code below, which uses the foreach package.
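library(foreach)
library(parallel)
library(doParallel)
cores <- detectCores()   # 32
cl <- makeCluster(cores) # start one worker process per core
registerDoParallel(cl)   # register the cluster as the %dopar% backend
foreach(j = 1:2) %dopar% {
  # else-if keeps the backtest result as the value of the block for every j
  if (j == 1) {
    get_backtestRUN(inputA)
  } else if (j == 2) {
    get_backtestRUN(inputB)
  }
}
stopCluster(cl)          # shut the workers down when finished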
My first question is general: is the foreach package the best way to solve this problem?
My second question is about additional computing power, since I can only run 8 backtests in parallel on my local machine. There are plenty of options online, and I would like recommendations on the most R-friendly way to proceed.
Thanks for your help and time,
Upvotes: 2
Views: 185
Reputation: 588
This link answers my question pretty clearly:
https://www.r-bloggers.com/how-to-go-parallel-in-r-basics-tips/
The important part is quoted below:
The idea behind the foreach package is to create ‘a hybrid of the standard for loop and lapply function’ and its ease of use has made it rather popular. The set-up is slightly different: you need to “register” the cluster as below:
library(foreach)
library(doParallel)
cl<-makeCluster(no_cores)
registerDoParallel(cl)
Note that you can change the last two lines to:
registerDoParallel(no_cores)
But then you need to remember to call the following at the end, instead of stopCluster():
stopImplicitCluster()
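Applied to the backtests in the question, the registration steps above might look like the following minimal sketch. Note that run_backtest and inputs are hypothetical stand-ins (the real function and its 30 inputs are not shown in the question), and the worker count of 2 is an assumption chosen only to keep the example small.

```r
library(foreach)
library(doParallel)

# Hypothetical stand-in for the long-running backtest function.
run_backtest <- function(input) {
  sum(input)
}

# Hypothetical inputs; the real list would hold the 30 backtest inputs.
inputs <- list(a = 1:10, b = 11:20, c = 21:30)

no_cores <- 2  # assumed worker count, kept small for the example
cl <- makeCluster(no_cores)
registerDoParallel(cl)

# One task per input; the default result is a list in input order.
# finally = ... stops the cluster even if a task fails.
results <- tryCatch(
  foreach(input = inputs) %dopar% run_backtest(input),
  finally = stopCluster(cl)
)
names(results) <- names(inputs)
print(results)
```

Iterating over a list of inputs avoids one if branch per backtest, so going from 2 runs to 30 only means adding elements to the list.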
The foreach function can be viewed as a more controlled version of parSapply that allows combining the results into a suitable format. By specifying the .combine argument we can choose how to combine our results; below are vector, matrix, and list examples:
foreach(exponent = 2:4,
        .combine = c) %dopar%
  base^exponent

foreach(exponent = 2:4,
        .combine = rbind) %dopar%
  base^exponent
         [,1]
result.1    4
result.2    8
result.3   16
foreach(exponent = 2:4,
        .combine = list,
        .multicombine = TRUE) %dopar%
  base^exponent
[[1]]
[1] 4

[[2]]
[1] 8

[[3]]
[1] 16
Note that the last is the default and can be achieved without any tweaking, just foreach(exponent = 2:4) %dopar%. In the example it is worth noting the .multicombine argument that is needed to avoid a nested list. The nesting occurs due to the sequential .combine function calls, i.e. list(list(result.1, result.2), result.3):
foreach(exponent = 2:4,
        .combine = list) %dopar%
  base^exponent
[[1]]
[[1]][[1]]
[1] 4

[[1]][[2]]
[1] 8


[[2]]
[1] 16
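The excerpt never defines base, so the snippets above do not run on their own. Here is a self-contained sketch that puts the four .combine variants side by side, assuming base is 2 (which matches the outputs 4, 8, 16 shown above) and an arbitrary worker count of 2:

```r
library(foreach)
library(doParallel)

registerDoParallel(2)  # implicit backend; 2 workers is an assumed count

base <- 2  # assumed value, consistent with the outputs 4, 8, 16

# Vector: results are concatenated with c().
as_vector <- foreach(exponent = 2:4, .combine = c) %dopar%
  base^exponent

# Matrix: each result becomes one row via rbind().
as_matrix <- foreach(exponent = 2:4, .combine = rbind) %dopar%
  base^exponent

# Flat list: .multicombine = TRUE passes all results to list() at once.
flat_list <- foreach(exponent = 2:4, .combine = list,
                     .multicombine = TRUE) %dopar%
  base^exponent

# Nested list: list() is applied pairwise, so earlier results get wrapped.
nested_list <- foreach(exponent = 2:4, .combine = list) %dopar%
  base^exponent

stopImplicitCluster()  # matches registerDoParallel(2) above

str(as_vector)
dim(as_matrix)
length(flat_list)
length(nested_list)
```

For 30 backtests whose results are larger objects, the default list output (no .combine at all) is probably the simplest choice; c or rbind make more sense when each run returns a single number or a row of summary statistics.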
Upvotes: 1