Reputation: 41
Suppose I have a function f() and a vector d:
f <- function(x) dexp(x, 2)
d <- runif(10, 1, 5)
Now I want to run a for loop like this:
dnew <- numeric(length(d))
for (i in seq_along(dnew)){
dnew[i] <- f(d[i])
}
How can I do this in parallel?
Upvotes: 1
Views: 7921
Reputation: 2556
The example code is faster without a for loop, because dexp() is already vectorized:
dnew2 <- f(d) # 'f()' and 'd' from question
all.equal(dnew, dnew2) # 'dnew' from question
[1] TRUE
library(microbenchmark)
microbenchmark('for loop' = for (i in seq_along(dnew)){ dnew[i] <- f(d[i]) },
'vectorized' = { dnew2 = f(d) })
Unit: microseconds
expr min lq mean median uq max neval
for loop 15.639 16.4455 17.66640 17.0045 18.089 43.938 100
vectorized 1.249 1.3140 1.44039 1.3845 1.516 2.424 100
It can be parallelized with foreach:
library(foreach)
library(doParallel); registerDoParallel(2)
dnew3 <- foreach(i=seq_along(dnew), .combine=c) %dopar% {
f(d[i])
}
all.equal(dnew, dnew3)
[1] TRUE
The parallelized version is slower, because the parallel overhead is larger than the benefit.
microbenchmark('for loop' = for (i in seq_along(dnew)){ dnew[i] <- f(d[i]) },
'foreach' = { dnew3 <- foreach(i=seq_along(dnew), .combine=c) %dopar% {
f(d[i]) }
})
Unit: microseconds
expr min lq mean median uq max neval
for loop 17.799 22.048 31.01027 32.7615 37.0945 67.265 100
foreach 11875.845 13003.558 13576.64759 13427.1015 14041.3455 17782.638 100
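One way to reduce that overhead is to give each worker one large chunk instead of one element per task, so the vectorized f() runs once per worker. A minimal sketch, assuming two workers; the names n_workers, chunks and dnew_chunked are made up for illustration, and whether this beats plain f(d) still depends on how expensive f() is:

```r
library(foreach)
library(doParallel)

n_workers <- 2                      # assumed number of workers
registerDoParallel(n_workers)

f <- function(x) dexp(x, 2)
d <- runif(1e5, 1, 5)

# Split d into contiguous chunks, one per worker
chunks <- split(d, cut(seq_along(d), n_workers, labels = FALSE))

# Each task evaluates the vectorized f() on a whole chunk;
# results are concatenated in the original order
dnew_chunked <- foreach(ch = chunks, .combine = c) %dopar% f(ch)

stopImplicitCluster()
```

With only as many tasks as workers, the fixed cost per task is paid twice rather than once per element.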
If f() takes more time to evaluate, the foreach version is faster:
f <- function(x){
Sys.sleep(.3)
dexp(x, 2)
}
microbenchmark('for loop' = for (i in seq_along(dnew)){ dnew[i] <- f(d[i]) },
'foreach' = {dnew3 <- foreach(i=seq_along(dnew), .combine=c) %dopar% {
f(d[i]) }
}, times=2)
Unit: seconds
expr min lq mean median uq max neval
for loop 3.004271 3.004271 3.004554 3.004554 3.004837 3.004837 2
foreach 1.515458 1.515458 1.515602 1.515602 1.515746 1.515746 2
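These timings match a rough estimate: ten calls at 0.3 s each take about 3 s one after another, and about 1.5 s when split over two workers. A small sketch to reproduce the effect with system.time(), assuming two workers and a shorter, made-up sleep of 0.1 s so that it runs quickly (slow_f, t_seq and t_par are hypothetical names):

```r
library(foreach)
library(doParallel)
registerDoParallel(2)               # assumed: two workers

# Hypothetical slow function: sleeps, then evaluates the density
slow_f <- function(x) {
  Sys.sleep(0.1)
  dexp(x, 2)
}
d <- runif(10, 1, 5)

# Sequential: one call after another
t_seq <- system.time(
  res_seq <- vapply(d, slow_f, numeric(1))
)["elapsed"]

# Parallel: the ten calls are shared between the workers
t_par <- system.time(
  res_par <- foreach(i = seq_along(d), .combine = c) %dopar% slow_f(d[i])
)["elapsed"]

stopImplicitCluster()
print(c(sequential = t_seq, parallel = t_par))
```

The sequential time should be roughly twice the parallel one, plus whatever startup cost the workers add on your machine.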
Upvotes: 4
Reputation: 1086
Simple for loop
a <- function(x) {dexp(x,2)}
d<- runif(10,1,5)
d
dnew <- numeric(length(d))
for (i in seq_along(dnew)){
dnew[i]<- a(d[i])
}
dnew
Parallel version
library(parallel) # makeCluster() and clusterApply() come from 'parallel'
no_cores <- max(1, detectCores() - 1) # leave one core free
cl <- makeCluster(no_cores, type="FORK") # FORK is not available on Windows
dnew <- unlist(clusterApply(cl=cl, x=d, fun = a)) # clusterApply() returns a list
stopCluster(cl)
dnew
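If you need this to run on Windows as well, a portable variant can be sketched with the default PSOCK cluster and parSapply(), which simplifies the result to a vector. This assumes two workers; dnew_par is just an illustrative name:

```r
library(parallel)

a <- function(x) dexp(x, 2)
d <- runif(10, 1, 5)

cl <- makeCluster(2)                # default type is PSOCK, works on all platforms
dnew_par <- parSapply(cl, d, a)     # 'a' is sent to the workers; result is a numeric vector
stopCluster(cl)

dnew_par
```

For a function this cheap the cluster startup dominates, so it only pays off when a() does real work.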
Take a look at this blog post: https://www.r-bloggers.com/lets-be-faster-and-more-parallel-in-r-with-doparallel-package/
Hope it helps!
Upvotes: 1