Force parLapply to do jobs in order

Question

I have a sequence of jobs that I would like to parallelize as close to in order as possible (I know that some nodes will complete before others).

My current script looks like:

library(parallel)

cl <- makeCluster(12)

iseq <- seq(1, 10000, 1)

results <- unlist(parLapply(cl, iseq, function(y) {
    # Each worker appends its task number to a shared progress file
    write(y, "progress.txt", append = TRUE)
}))

stopCluster(cl)

The values in progress.txt are wildly out of order: 1, 826, 2, 3, 827, and so on.

Steve Weston · Accepted Answer

If you use clusterApply, then your results won't be as wildly out of order:

results <- unlist(clusterApply(cl, iseq, function(y) {
    write(y, "progress.txt", append = TRUE)
}))

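parLapply splits the tasks into one contiguous chunk per worker and hands each worker its whole chunk at once, so in your case the first task assigned to one of your workers is task 826. clusterApply uses round-robin scheduling, handing out one task at a time to each worker in turn, so it can't get too badly out of order.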
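
To see the difference without starting a cluster, here is a rough sketch comparing the two assignment patterns. It uses splitIndices from the parallel package, which as far as I know is the helper parLapply relies on for chunking. The modulo formula is only a simplified model of round-robin assignment, not clusterApply's actual code, and the variable names (n_tasks, n_workers, chunk_starts, rr_worker) are made up for illustration.

```r
library(parallel)

n_tasks   <- 10000
n_workers <- 12

# Chunked split: one contiguous block of task indices per worker.
chunks <- splitIndices(n_tasks, n_workers)

# First task index each worker would start with under chunking.
chunk_starts <- vapply(chunks, function(ix) ix[1], integer(1))
print(chunk_starts)

# Simplified round-robin model (an assumption, not the library's code):
# task i goes to worker ((i - 1) %% n_workers) + 1.
rr_worker <- (seq_len(n_tasks) - 1L) %% n_workers + 1L

# First few tasks handled by worker 2 under round-robin.
print(head(which(rr_worker == 2L), 3))
```

With chunking, the workers start at widely separated task numbers, which is why the progress file interleaves values that are hundreds apart. With round-robin, the tasks running at any moment are close to each other in the sequence.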
