Reputation: 8733
How many list elements are sent to each worker process when calling parLapply()? For example, let's say we have a list of 6 elements and 2 workers on a snow SOCK cluster. Does parLapply() send three list elements to each worker in one send call, or does it send one element per send?
I want to minimize my cluster communication overhead (I have many list elements that can each be processed relatively quickly by a CPU), and from what I see on the htop CPU meters it looks like snow is sending one list element at a time. Is it possible to set the number of list elements dispatched in one send call?
Upvotes: 2
Views: 307
Reputation: 19677
The parLapply function splits the input into one chunk per worker. It does that with the splitList function, as seen in the implementation of parLapply:
function (cl = NULL, X, fun, ...)
do.call(c, clusterApply(cl, x = splitList(X, length(cl)), fun = lapply,
fun, ...), quote = TRUE)
So with a list of 6 elements and 2 workers, it will send 3 elements to each worker with a single "send" operation per worker. This is similar to the behavior of mclapply with mc.preschedule set to TRUE (the default value).
So it seems that parLapply is already performing the optimization that you want.
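To see the chunking without starting a cluster, here is a minimal sketch that mimics the split locally. It assumes the parallel package, whose exported splitIndices function is, as far as I can tell, what the internal splitList helper builds on; n_workers is just a stand-in for length(cl).

```r
library(parallel)

X <- as.list(1:6)   # six list elements, as in the question
n_workers <- 2L     # stand-in for length(cl); no cluster is created here

# Cut the index range 1..6 into one contiguous block per worker,
# then pull the matching elements out of X (this mirrors splitList).
idx <- splitIndices(length(X), n_workers)
chunks <- lapply(idx, function(i) X[i])

# One chunk per worker, so one clusterApply task (one send) per worker.
print(length(chunks))
print(lengths(chunks))
```

Each element of chunks is what a single worker receives in one task, so the number of sends equals the number of workers rather than the number of list elements.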
It's interesting to note that by simply changing lapply to mclapply in the definition of parLapply, you can create a hybrid parallel programming function that might work quite well with nodes that have many cores.
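A rough sketch of that hybrid idea, assuming the parallel package on a Unix-like system (mclapply relies on forking, so on Windows mc.cores would have to be 1). The name parMcLapply and the mc.cores argument are my own for illustration, not part of snow or parallel.

```r
library(parallel)

# Hypothetical hybrid: one chunk per cluster worker, and each worker
# processes its chunk with mclapply instead of lapply.
parMcLapply <- function(cl, X, fun, ..., mc.cores = 2L) {
  # Same one-chunk-per-worker split that parLapply performs.
  chunks <- lapply(splitIndices(length(X), length(cl)), function(i) X[i])
  do.call(c,
          clusterApply(cl, x = chunks, fun = parallel::mclapply,
                       fun, ..., mc.cores = mc.cores),
          quote = TRUE)
}

# Usage: two socket workers, each forking up to two processes.
cl <- makeCluster(2)
X <- as.list(1:6)
sq <- function(v) v^2
res <- parMcLapply(cl, X, sq, mc.cores = 2L)
stopCluster(cl)
```

Whether this pays off depends on how expensive each element is: forking inside every worker adds its own overhead, so it is worth timing against plain parLapply before relying on it.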
Upvotes: 5