Reputation: 8818
Consider the following mock example:
library(foreach)
library(doParallel)

cl <- makeCluster(3)
registerDoParallel(cl)

pdf("mypdf.pdf", width = 8, height = 8)
layout(matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE), heights = c(1, 1))
result <- foreach(i = 1:10000) %dopar% {
  if (i %in% c(5, 10, 15, 20)) {
    plot(i)
  }
  i + 2
}
dev.off()
This is what I am trying to do: for i in 1:10000, I want to return i + 2. And, if i is equal to 5, 10, 15, or 20, I want to plot the point i to a PDF. I want all the plots (4 plots) to be in the same PDF.
With a simple for loop, this works. However, with parallel computing, it doesn't seem to work.
Any ideas?
Thanks!
Upvotes: 2
Views: 1938
Reputation: 19667
Keep in mind that the cluster workers are completely separate R sessions executing in different processes. Your code works with a for loop because it executes within a single process. With a foreach loop, you're only opening the PDF graphics device in the master process, while the plotting happens in the workers, which have no access to that device. Instead, you need to do something like this:
result <- foreach(i = 1:10000) %dopar% {
  if (i %in% c(5, 10, 15, 20)) {
    pdf(sprintf('task_%03d.pdf', i))
    plot(i)
    dev.off()
  }
  i + 2
}
Of course, this creates four separate plot files. If you want to merge them, you'll have to find a tool to do that and call it from the master either in the combine function or after the foreach loop exits. You may wish to create the plot files in a different format and convert the merged file to a PDF.
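As a rough sketch of the "merge after the loop exits" option, here is one way it could look. This assumes the qpdf package is installed (it is not part of the original setup) and that the workers run on the same machine as the master, so they can all write to the master's temporary directory. The names out_dir, plot_ids, parts and merged are just placeholders for this illustration, and the loop is shortened to 1:100 to keep it quick.

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)

# Assumption: workers share the master's filesystem, so an absolute
# path created on the master is also writable from each worker.
out_dir  <- tempdir()
plot_ids <- c(5, 10, 15, 20)

result <- foreach(i = 1:100) %dopar% {
  if (i %in% plot_ids) {
    # Each worker opens and closes its own PDF device.
    pdf(file.path(out_dir, sprintf("task_%03d.pdf", i)))
    plot(i)
    dev.off()
  }
  i + 2
}

stopCluster(cl)

# Back on the master: merge the per-task files into a single PDF.
# Assumption: the qpdf package is available.
parts  <- file.path(out_dir, sprintf("task_%03d.pdf", plot_ids))
merged <- file.path(out_dir, "mypdf.pdf")
qpdf::pdf_combine(input = parts, output = merged)
```

Note that the merged file has one plot per page rather than the 2x2 layout from the question. If you need all four plots on one page, a different route is to have the workers return only the data to be plotted and do the actual plotting on the master after the loop.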
Upvotes: 4