Reputation: 11518
I need to do some simulations and for debugging purposes I want to use set.seed
to get the same result. Here is the example of what I am trying to do:
library(foreach)
library(doMC)
registerDoMC(2)
set.seed(123)
a <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}
set.seed(123)
b <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}
Objects a and b should be identical, i.e. sum(abs(a-b)) should be zero, but this is not the case. Am I doing something wrong, or have I stumbled onto some feature?
I am able to reproduce this on two different systems, with R 2.13 and R 2.14.
Upvotes: 27
Views: 11527
Reputation: 2319
Using set.seed(123, kind = "L'Ecuyer-CMRG") also does the trick and does not require an extra package:
set.seed(123, kind = "L'Ecuyer-CMRG")
a <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}
set.seed(123, kind = "L'Ecuyer-CMRG")
b <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}
identical(a,b)
# TRUE
Upvotes: 10
Reputation: 369
For more complicated loops, you might have to call set.seed() inside the foreach loop body:
library(foreach)
library(doMC)
registerDoMC(2)
library(doRNG)
set.seed(123)
a <- foreach(i=1:2,.combine=cbind) %dopar% {
create_something <- c(1, 2, 3)
rnorm(5)
}
set.seed(123)
b <- foreach(i=1:2,.combine=cbind) %dopar% {
create_something <- c(4, 5, 6)
rnorm(5)
}
identical(a, b)
# FALSE
versus
a <- foreach(i=1:2,.combine=cbind) %dopar% {
create_something <- c(1, 2, 3)
set.seed(123)
rnorm(5)
}
b <- foreach(i=1:2,.combine=cbind) %dopar% {
create_something <- c(4, 5, 6)
set.seed(123)
rnorm(5)
}
identical(a, b)
# TRUE
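One caveat with the version above: because every iteration calls set.seed(123), both iterations return the same five numbers, which is rarely what a simulation wants. A minimal sketch of a variation, assuming it is acceptable to derive a per-iteration seed from the loop index (base_seed and run_once are hypothetical names, not part of the original answer):

```r
library(foreach)
library(doMC)
registerDoMC(2)

base_seed <- 123  # hypothetical base value, chosen for illustration

# Each iteration seeds itself from the loop index, so reruns match
# while the iterations no longer share one identical stream.
run_once <- function() {
  foreach(i = 1:2, .combine = cbind) %dopar% {
    set.seed(base_seed + i)
    rnorm(5)
  }
}

a <- run_once()
b <- run_once()
identical(a, b)                # TRUE: same seeds, same draws
identical(a[, 1], a[, 2])      # FALSE: the two iterations now differ
```

Consecutive integer seeds are not guaranteed to give statistically independent streams, so for real simulations the L'Ecuyer-CMRG or doRNG approaches from the other answers are probably the safer choice.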
Upvotes: 5
Reputation: 368439
My default answer used to be "well then don't do that" (using foreach) as the snow package does this (reliably!) for you.
But as @Spacedman points out, Renaud's new doRNG is what you are looking for if you want to remain with the doFoo / foreach family.
The real key, though, is a clusterApply-style call to get the seeds set on all nodes, in a fashion that is coordinated across streams. Oh, and did I mention that snow by Tierney, Rossini, Li and Sevcikova has been doing this for you for almost a decade?
Edit: And while you didn't ask about snow, for completeness here is an example from the command-line:
edd@max:~$ r -lsnow -e'cl <- makeSOCKcluster(c("localhost","localhost"));\
clusterSetupRNG(cl);\
print(do.call("rbind", clusterApply(cl, 1:4, \
function(x) { stats::rnorm(1) } )))'
Loading required package: utils
Loading required package: utils
Loading required package: rlecuyer
[,1]
[1,] -1.1406340
[2,] 0.7049582
[3,] -0.4981589
[4,] 0.4821092
edd@max:~$ r -lsnow -e'cl <- makeSOCKcluster(c("localhost","localhost"));\
clusterSetupRNG(cl);\
print(do.call("rbind", clusterApply(cl, 1:4, \
function(x) { stats::rnorm(1) } )))'
Loading required package: utils
Loading required package: utils
Loading required package: rlecuyer
[,1]
[1,] -1.1406340
[2,] 0.7049582
[3,] -0.4981589
[4,] 0.4821092
edd@max:~$
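For readers without snow and rlecuyer installed, roughly the same idea can be sketched with the base parallel package (shipped with R since 2.14), which is not what the transcript above uses. The sketch assumes two local workers; draw is a hypothetical helper name, and clusterSetRNGStream() is used here to give each worker its own L'Ecuyer-CMRG stream derived from one seed:

```r
library(parallel)

# Hypothetical helper: start a fresh two-worker cluster, seed its
# RNG streams from one value, and draw one normal variate per task.
draw <- function(seed) {
  cl <- makeCluster(2)                    # two local worker processes
  on.exit(stopCluster(cl))                # always shut the workers down
  clusterSetRNGStream(cl, iseed = seed)   # one coordinated stream per worker
  do.call(rbind, clusterApply(cl, 1:4, function(x) stats::rnorm(1)))
}

first  <- draw(123)
second <- draw(123)
identical(first, second)   # TRUE: same seed, same streams, same draws
```

clusterApply() hands tasks to workers in a fixed round-robin order, which is why the result does not depend on timing; a load-balanced variant such as clusterApplyLB() would not offer that guarantee.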
Edit: And for completeness, here is your example combined with what is in the docs for doRNG:
R> library(foreach)
R> library(doMC)
Loading required package: multicore
Attaching package: ‘multicore’
The following object(s) are masked from ‘package:parallel’:
mclapply, mcparallel, pvec
R> registerDoMC(2)
R> library(doRNG)
R> set.seed(123)
R> a <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}
R> set.seed(123)
R> b <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}
R> identical(a,b)
[1] FALSE ## ie standard approach not reproducible
R>
R> seed <- doRNGseed()
R> a <- foreach(i=1:2,.combine=cbind) %dorng% { rnorm(5) }
R> b <- foreach(i=1:2,.combine=cbind) %dorng% { rnorm(5) }
R> doRNGseed(seed)
R> a1 <- foreach(i=1:2,.combine=cbind) %dorng% { rnorm(5) }
R> b1 <- foreach(i=1:2,.combine=cbind) %dorng% { rnorm(5) }
R> identical(a,a1) && identical(b,b1)
[1] TRUE ## all is well now with doRNGseed()
R>
Upvotes: 21
Reputation: 94267
Is the doRNG package any use to you? I suspect your problem is due to two threads both splatting the random seed vector:
http://ftp.heanet.ie/mirrors/cran.r-project.org/web/packages/doRNG/index.html
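In case it helps, here is a minimal sketch of how doRNG could be applied to the example from the question. It assumes a doRNG version in which %dorng% picks up the seed set by a preceding set.seed() call, and it simply swaps %dopar% for %dorng%:

```r
library(foreach)
library(doMC)
library(doRNG)
registerDoMC(2)

# %dorng% gives every iteration its own RNG stream, derived from
# the seed that is current when the loop starts.
set.seed(123)
a <- foreach(i = 1:2, .combine = cbind) %dorng% { rnorm(5) }

set.seed(123)
b <- foreach(i = 1:2, .combine = cbind) %dorng% { rnorm(5) }

# Compare the numbers only; doRNG attaches extra attributes to results.
all.equal(a, b, check.attributes = FALSE)
```

The comparison ignores attributes on purpose, since doRNG stores the seeds it used alongside the result.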
Upvotes: 5