I am trying to use foreach to parallelize some matrix calculations. However, I found that it performs about the same as sapply and a plain for loop, even though I am using 10 cores.
library(microbenchmark)
library(foreach)
library(doParallel)
mat = matrix(rnorm(3000 * 3000), 3000) # 3000 X 3000
g = rnorm(3000) # 3000
# use 10 cores
cl <- makeCluster(10)
registerDoParallel(cl)
microbenchmark(
  # multi-core with foreach
  foreach(i = 1:100, .combine = c) %dopar% {t(g) %*% mat %*% g},
  # sapply
  sapply(1:100, function(i){t(g) %*% mat %*% g}),
  # for loop
  for(i in 1:100){t(g) %*% mat %*% g},
  times = 20)
stopCluster(cl)
On the other hand, if I use a different function (Sys.sleep()), foreach is indeed about 10X faster than sapply and the for loop.
cl <- makeCluster(10)
registerDoParallel(cl)
microbenchmark(
  # multi-core with foreach
  foreach(i = 1:100, .combine = c) %dopar% {Sys.sleep(0.01)},
  # sapply
  sapply(1:100, function(i){Sys.sleep(0.01)}),
  # for loop
  for(i in 1:100){Sys.sleep(0.01)},
  times = 20)
stopCluster(cl)
What is the reason for this, and how can I improve the performance of the matrix calculation?
Upvotes: 0
Views: 97
@Dave2e suggested that I increase the number of iterations. Indeed, with 3000 iterations instead of 100:
cl <- makeCluster(10)
registerDoParallel(cl)
microbenchmark(
  # multi-core with foreach
  foreach(i = 1:3000, .combine = c) %dopar% {t(g) %*% mat %*% g},
  # sapply
  sapply(1:3000, function(i){t(g) %*% mat %*% g}),
  # for loop
  for(i in 1:3000){t(g) %*% mat %*% g},
  times = 5)
stopCluster(cl)
Then foreach takes 5 s, compared to 32 s for sapply and the for loop.
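As a rough sanity check, here is some back-of-the-envelope arithmetic on those two timings. It is only a sketch and not a new measurement. It assumes that the cost per iteration is constant and that everything beyond ideal 10-way scaling is fixed parallel overhead (starting tasks, shipping mat and g to the workers, combining results). The variable names are just for illustration.

```r
# Timings reported above for 3000 iterations, in seconds
t_serial   <- 32    # sapply / for loop
t_parallel <- 5     # foreach on 10 workers
n_iter     <- 3000
n_workers  <- 10

# Serial cost of a single t(g) %*% mat %*% g, roughly 0.0107 s
per_iter_serial <- t_serial / n_iter

# Total serial work in the original 100-iteration benchmark, roughly 1.07 s
serial_100 <- per_iter_serial * 100

# Observed speedup at 3000 iterations: 6.4X rather than 10X
speedup <- t_serial / t_parallel

# Time with perfect scaling (3.2 s) and the remainder not explained by it (1.8 s)
ideal_parallel <- t_serial / n_workers
unaccounted    <- t_parallel - ideal_parallel

print(c(per_iter_serial = per_iter_serial, serial_100 = serial_100,
        speedup = speedup, unaccounted = unaccounted))
```

If those assumptions are roughly right, the 100-iteration version only had about 1 s of serial work to split up. A fixed cost of that order would be enough to cancel most of the gain, which would be consistent with what I saw in the first benchmark.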
Upvotes: 1