Reputation: 299
I want to use a foreach loop on a Windows machine to make use of multiple cores in a CPU-heavy computation. However, I cannot get the worker processes to do any work.
Here is a minimal example of what I think should work, but doesn't:
library(snow)
library(doSNOW)
library(foreach)
cl <- makeSOCKcluster(4)
registerDoSNOW(cl)
pois <- rpois(1e6, 1500) # draw 1e6 times from a Poisson distribution with mean 1500
x <- foreach(i=1:1e6) %dopar% {
  runif(pois[i]) # draw from the uniform distribution pois[i] times
}
stopCluster(cl)
snow does create the 4 "slave" processes, but they don't do any work.
I hope this isn't a duplicate, but I cannot find anything with the search terms I can come up with.
Upvotes: 1
Views: 373
Reputation: 19677
Although this particular example isn't worth executing in parallel, it's worth noting that since it uses doSNOW, the entire pois vector is auto-exported to all of the workers even though each worker only needs a fraction of it. However, you can avoid auto-exporting any data to the workers by iterating over pois itself:
x <- foreach(p=pois) %dopar% {
  runif(p)
}
Now the elements of pois are sent to the workers in the tasks, so each worker only receives the data that's actually needed to perform its tasks. This technique isn't important when using doMC, since the doMC workers are forked from the master process and therefore get pois for free.
You can also often improve performance enormously by processing pois in larger chunks using an iterator function such as "isplitVector" from the itertools package.
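As a rough sketch of the chunking idea, assuming the foreach, itertools, and doSNOW packages are installed: each task below receives one chunk of pois rather than a single element, so far fewer tasks are sent. The chunk count of 4 (one per worker) and the smaller vector length are my own choices for illustration, not something from the question.

```r
library(foreach)
library(itertools)
library(doSNOW)

cl <- makeSOCKcluster(4)
registerDoSNOW(cl)

# Smaller than the 1e6 in the question so the result fits comfortably in memory
pois <- rpois(1e4, 1500)

# isplitVector() yields pois in 4 roughly equal pieces, so only 4 tasks are sent.
# Each task returns a list of vectors; .combine = c concatenates those lists.
x <- foreach(p = isplitVector(pois, chunks = 4), .combine = c) %dopar% {
  lapply(p, runif)
}

stopCluster(cl)
```

The result has the same shape as in the original loop: a list with one vector of uniform draws per element of pois.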
Upvotes: 3
Reputation: 132999
It's probably working (at least it does on my Mac). However, a single call to runif takes so little time that nearly all of the time goes to overhead, and the child processes spend negligible CPU time on the actual tasks.
x <- foreach(i=1:20) %dopar% {
  system.time(runif(pois[i]))
}
x[[1]]
#   user  system elapsed
#      0       0       0
Parallelization makes sense if you have some heavy computations that cannot be optimized. That's not the case in your example. You don't need 1e6 calls to runif; one would be sufficient (e.g., runif(sum(pois)) and then split the result).
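One way the single-call approach might look, using only base R. The grouping via rep() and split() is my own sketch of "split the result", and the vector is shortened to 1e4 elements because 1e6 draws with mean 1500 would need roughly 1.5e9 doubles (about 12 GB).

```r
pois <- rpois(1e4, 1500)

# One vectorized call produces every uniform draw at once
u <- runif(sum(pois))

# Label each draw with the index of the pois element it belongs to.
# Setting the factor levels explicitly keeps an (unlikely) zero count as an empty group.
grp <- factor(rep(seq_along(pois), times = pois), levels = seq_along(pois))

# Split into a list with one vector per element of pois
x <- unname(split(u, grp))
```

Here x[[i]] holds pois[i] draws, matching what the foreach loop would have returned.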
PS: Always test with a smaller example.
Upvotes: 3