Reputation: 1589
This question is related to this one, where I asked how to replicate a user-defined function. Now I would like to parallelize the operations in order to save time. What I have done so far is:
I have defined a custom function my.fun(), which returns output, a matrix with 1000 rows and 20 columns.
I replicate output, say, 5 times and store the results in a single matrix called final through: final <- do.call(rbind, replicate(5, my.fun(), simplify=FALSE)). Hence, in this example final is a 5000-row matrix.
What I would like to do now is parallelize the 5 (or even more) output replications before binding the results in the final matrix.
How would you do that? What I have (wrongly) done so far is:
library(snowfall)
sfInit(parallel = TRUE, cpus = 4, type = "SOCK")
# previously defined objects manipulated within my.fun
sfExport(...)
my.fun = function() {
...
return(output)
}
final <- do.call(rbind, sfSapply(1:5, fun=my.fun(), simplify=FALSE))
sfStop()
but it returns:
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'fun' of mode 'function' was not found
Any help would be greatly appreciated! Please note that I do not necessarily want to use snowfall: the final goal is to parallelize the computation of final in an efficient way (in reality I have to run a lot of replications).
Upvotes: 0
Views: 924
Reputation: 121626
I don't have any experience with parallel computing in R.
I had to add a dummy argument to the function my.fun; otherwise sfSapply complains with this error:
first error: unused argument(s) (X[[1]])
So I added x as an argument:
my.fun <- function(x) matrix(1:4, 2,2)
Then I benchmarked the parallel solution against the plain sapply solution:
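library(snowfall)
library(rbenchmark)
# start 4 workers, then time the parallel (pp) and serial (nopp) versions
sfInit(parallel = TRUE, cpus = 4)
benchmark(
  pp   = sfSapply(1:20000, fun = my.fun, simplify = FALSE),
  nopp = sapply(1:20000, FUN = my.fun, simplify = FALSE))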
The parallel solution is slower than the classic one! I am really confused. Maybe others more experienced with parallel computing in R can give us a logical explanation.
  test replications elapsed relative user.self sys.self user.child sys.child
2 nopp          100   15.22    1.000     13.90     0.02         NA        NA
1   pp          100   27.28    1.792     11.95     2.04         NA        NA
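One likely explanation, offered as a guess rather than a diagnosis of the numbers above: each call to my.fun here is so cheap that the cost of sending tasks to the workers and collecting the results outweighs any gain. The sketch below uses the base parallel package instead of snowfall, and slow.fun is a hypothetical stand-in for an expensive my.fun (it just sleeps for 0.2 seconds). With work that heavy per task, the parallel run would be expected to finish sooner, though the exact timings depend on the machine.

```r
library(parallel)

# Hypothetical workload: slow.fun stands in for an expensive my.fun.
slow.fun <- function(i) {
  Sys.sleep(0.2)                  # pretend each replication takes 0.2 seconds
  matrix(i, nrow = 2, ncol = 2)
}

cl <- makeCluster(2)              # two socket workers; adjust to your machine

# Time the same 8 tasks serially and on the cluster.
t.serial   <- system.time(res.serial   <- lapply(1:8, slow.fun))[["elapsed"]]
t.parallel <- system.time(res.parallel <- parLapply(cl, 1:8, slow.fun))[["elapsed"]]

stopCluster(cl)

identical(res.serial, res.parallel)   # both approaches return the same list
c(serial = t.serial, parallel = t.parallel)
```

If the real my.fun is fast, it may help to have each task run a whole batch of replications, so that there are few large tasks rather than many tiny ones.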
Upvotes: 1
Reputation: 14093
sfSapply expects fun to be a function, but you hand over the result of one call to my.fun. That is, you want to hand over my.fun, not my.fun().
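A minimal sketch of what the corrected call could look like. The body of my.fun below is a hypothetical placeholder (random numbers in a 1000 x 20 matrix), since the real function is not shown in the question. It takes a dummy argument i because each element of 1:5 is passed to it, and sfLapply is used in place of sfSapply(..., simplify=FALSE) because it returns a list directly. cpus = 2 is an arbitrary choice here.

```r
library(snowfall)

# Hypothetical stand-in for the asker's my.fun; the dummy argument i
# receives the replication index and is otherwise ignored.
my.fun <- function(i) {
  matrix(rnorm(1000 * 20), nrow = 1000, ncol = 20)
}

sfInit(parallel = TRUE, cpus = 2, type = "SOCK")

# Pass the function itself (my.fun), not the result of calling it (my.fun()).
res <- sfLapply(1:5, my.fun)

sfStop()

# Bind the five 1000 x 20 matrices into one matrix.
final <- do.call(rbind, res)
dim(final)   # 5000 rows, 20 columns
```

If the real my.fun uses objects from the global environment, they still need to be exported to the workers with sfExport, as in the question. For reproducible random numbers across workers, snowfall also provides sfClusterSetupRNG.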
Upvotes: 3