What changes should I make to get a reproducible result here? I run it multiple times and the result vector is different each time. Thanks for any help.
cl <- makeCluster(2)
registerDoParallel(2)
set.seed(123)
results <- unlist(llply(seq_along(1:4),
                        .fun = function(x) {
                          runif(1)
                        },
                        .parallel = T,
                        .paropts = list(.export = ls(.GlobalEnv))))
stopCluster(cl)
The following example will give reproducible results on Linux, Mac OS X, and Windows:
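library(plyr)
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)            # register the cluster object itself
opts <- list(preschedule = TRUE)  # fixed task-to-worker mapping
clusterSetRNGStream(cl, 123)      # seed an independent RNG stream on each worker
r <- llply(1:20,
           .fun = function(x) runif(10),
           .parallel = TRUE,
           .paropts = list(.options.snow = opts))
stopCluster(cl)                   # shut the workers down when finished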
The preschedule=TRUE option is needed to prevent doParallel from using load balancing, which would make the mapping of tasks to workers unpredictable.
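One way to confirm that the result is reproducible is to wrap the whole setup in a function and compare two runs. This is only a minimal sketch, assuming plyr and doParallel are installed; `run_once` is a hypothetical helper name, not part of either package.

```r
library(plyr)
library(doParallel)

# Hypothetical helper: start a fresh 2-worker cluster, seed it, run the job.
run_once <- function(seed) {
  cl <- makeCluster(2)
  on.exit(stopCluster(cl))        # always shut the workers down
  registerDoParallel(cl)          # register the cluster object itself
  clusterSetRNGStream(cl, seed)   # one independent RNG stream per worker
  llply(1:20,
        .fun = function(x) runif(10),
        .parallel = TRUE,
        .paropts = list(.options.snow = list(preschedule = TRUE)))
}

r1 <- run_once(123)
r2 <- run_once(123)
same <- identical(r1, r2)   # compare the two runs element by element
print(same)
```

With the same seed, the same number of workers, and prescheduling turned on, the two runs should match. Changing the number of workers changes which stream each task draws from, so the numbers would differ.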
If you're using Linux or Mac OS X and you want doParallel to use mclapply, you could use this approach:
if (.Platform$OS.type != "windows") {
  registerDoParallel(2)
  RNGkind("L'Ecuyer-CMRG")
  set.seed(123)
  mc.reset.stream()
  r <- llply(1:20,
             .fun = function(x) runif(10),
             .parallel = TRUE)
}
This works because mclapply uses prescheduling by default. It won't work on Windows because doParallel will implicitly create a cluster object, and the RNG initialization won't have any effect.
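The same seeding pattern can be tried against mclapply directly, without plyr or doParallel in between. This is a rough sketch using only the base parallel package; the variable names `a`, `b`, and `same` are just for illustration, and the block is skipped on Windows because mclapply cannot fork there.

```r
library(parallel)

if (.Platform$OS.type != "windows") {
  RNGkind("L'Ecuyer-CMRG")   # RNG that supports independent parallel streams

  set.seed(123)
  mc.reset.stream()          # restart the parallel stream from the current seed
  a <- mclapply(1:20, function(x) runif(10), mc.cores = 2)

  set.seed(123)
  mc.reset.stream()
  b <- mclapply(1:20, function(x) runif(10), mc.cores = 2)

  same <- identical(a, b)    # compare the two runs
  print(same)
}
```

The comparison relies on mclapply's default prescheduling and on using the same number of cores in both runs.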
Note that in your example, you're creating a cluster object but not registering it, so it isn't going to be used by doParallel. You've got to use registerDoParallel(cl), otherwise doParallel will either use mclapply on a POSIX computer or an implicitly created cluster object on a Windows computer. It's important to initialize the RNG on the cluster workers that will actually perform the parallel computations.
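If you're unsure which backend ended up registered, foreach can report it. A small sketch, assuming doParallel is installed (it attaches foreach, which provides `getDoParName` and `getDoParWorkers`); the variable names are only for illustration:

```r
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)          # register the cluster object, not a core count

backend <- getDoParName()       # name of the backend foreach will dispatch to
workers <- getDoParWorkers()    # number of workers behind that backend
print(backend)
print(workers)

stopCluster(cl)
```

Running the same two queries after registerDoParallel(2) instead shows which backend doParallel picks implicitly on your platform.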