Reputation: 13
I just started using the foreach and %dopar% methods for parallel processing in R, but the results I'm getting are confusing and not the same as with a for loop. Here is the code I used to test those methods and the results I'm getting:
library(plyr); library(doParallel); library(foreach)
cs <- makeCluster(2)
registerDoParallel(cs)
sfor_start <- Sys.time()
s_for=as.numeric()
for (i in 1:1000) {
  s_for[i] = sqrt(i)
}
print(Sys.time() - sfor_start)
sdopar_start <- Sys.time()
sdopar=as.numeric()
foreach(k=1:1000) %dopar% {
  sdopar[k] = sqrt(k)
}
print(Sys.time() - sdopar_start)
And here are the results:
> s_for[1:10]; sdopar[1:10]
[1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751 2.828427 3.000000 3.162278
[1] NA NA NA NA NA NA NA NA NA NA
Thanks in advance :)
Upvotes: 0
Views: 9352
Reputation: 11738
Please read the documentation of functions before saying that they don't work.
foreach works more like lapply than a for loop. So, for example, foreach(k=1:1000) %dopar% sqrt(k) gives the same result as lapply(1:1000, sqrt).
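For instance (a minimal sketch; %do% is used here so it runs without a parallel backend, and the object names are only for illustration):

res_foreach <- foreach(k = 1:10) %do% sqrt(k)
res_lapply <- lapply(1:10, sqrt)
identical(res_foreach, res_lapply)
# should be TRUE: both return a list holding the individual results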
Still, it is true that you can modify a global variable when using foreach SEQUENTIALLY. When using parallelism, however, the vector sdopar is copied to each "cluster" worker, so you modify a copy, not the initial object. So you'll have to do as mentioned by @ChiPak and use the option .combine = c, or keep the list that foreach returns in sdopar and call do.call(c, sdopar) on it afterwards.
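A minimal sketch of the .combine = c route, reusing the cluster registered in the question (the timing code is left out):

sdopar <- foreach(k = 1:1000, .combine = c) %dopar% sqrt(k)
sdopar[1:10]
# now matches s_for[1:10]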
PS: always pre-allocate the vector you fill iteratively (to avoid the cost of growing it at each iteration), for example like this: s_for <- double(1000).
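For example, the sequential loop from the question with pre-allocation (a minimal sketch; the loop body itself is unchanged):

s_for <- double(1000)
for (i in 1:1000) {
  s_for[i] <- sqrt(i)
}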
Upvotes: 8