Seb

Reputation: 5497

Parallelizing a for-loop depending on previous outcomes

I'm just taking my first steps with the parallel package and the foreach() function, so this question may be rather basic. Here is the task I want to accomplish:

  1. apply a function to an object t
  2. use the outcome in the next iteration

A simple example would be:

newFunc<-function(){
    test[i+1] <<- sqrt(test[i])
}

library(foreach)

test <- c(1,rep(NA, 10))

foreach(i=1:11, .combine='rbind', .export='test') %do% newFunc()

This gives me a vector of ones, as a plain for-loop would. However, if I try to parallelise it, I get a different outcome:

test <- c(1,rep(NA, 10))

library(doParallel)
library(foreach)
cl <- makeCluster(4)
registerDoParallel(cl)

foreach(i=1:11, .combine='rbind', .export='test')%dopar% newFunc()

stopCluster(cl)

This leaves me with the output c(1, NA, NA, NA, NA, ..., NA). I guess this is because the workers don't know the results computed by the other workers? I hope I have supplied the necessary information. My actual function is of course more complex, but this example seemed the easiest way to demonstrate my problem.

Edit: I guess the first question is: can such a problem be parallelised at all?

Upvotes: 2

Views: 392

Answers (1)

Martin Morgan

Reputation: 46856

It's true that the iteration as posed does not run in parallel, but

test = numeric(11); test[] = 2
test^(1/2^(0:10))

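is the solution you're interested in: applying sqrt i times to a value x gives x^(1/2^i), so each element can be computed directly from the starting value without knowing the previous element. This is easy (though unnecessary, since the calculation is already vectorized) to parallelize: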

library(parallel)  # provides mclapply()

fun = function(i, test)
    test[i] ^ (1 / 2^(i - 1))
unlist(mclapply(seq_along(test), fun, test))
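As a quick sanity check, the closed form can be compared against the sequential loop from the question. This is only a sketch: the starting value 2 and the names x0, n, seq_result and closed_form are chosen here for illustration.

```r
x0 <- 2   # illustrative starting value
n <- 11

# Sequential version: each step depends on the previous one.
seq_result <- numeric(n)
seq_result[1] <- x0
for (i in seq_len(n - 1)) {
    seq_result[i + 1] <- sqrt(seq_result[i])
}

# Closed form: element i depends only on x0 and i, not on element i - 1.
closed_form <- x0^(1 / 2^(0:(n - 1)))

all.equal(seq_result, closed_form)
```

Because every element of the closed form is independent of the others, the elements can be computed in any order, which is what makes parallel evaluation possible.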

For parallel evaluation, fun should not update a non-local variable (as you do with <<-): each worker operates on its own copy of the data, so such assignments never reach the master process or the other workers. Maybe your real problem can be posed in a way that allows it to be evaluated in parallel, even if the original formulation seems to require sequential evaluation?
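A minimal sketch of that side-effect-free style, using a socket cluster from the parallel package so that it should also run on Windows, where mclapply cannot fork. The worker count of 2, the starting value 2 and the argument name x0 are assumptions made for this example only.

```r
library(parallel)

# Each call receives everything it needs as arguments and returns a value;
# nothing is written to a variable outside the function.
fun <- function(i, x0) {
    x0^(1 / 2^(i - 1))
}

cl <- makeCluster(2)   # 2 workers is an arbitrary choice for this sketch
res <- unlist(parLapply(cl, 1:11, fun, x0 = 2))
stopCluster(cl)

res
```

The same pattern carries over to foreach: let the loop body return its value and let .combine assemble the results, rather than assigning into test from inside the function.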

Upvotes: 2
