Reputation: 300
I'm trying to understand what is happening behind the Rcpp::sourceCpp() call in a parallelized environment. Recently, this was partially addressed in the question Using Rcpp function in parLapply on Windows.
In that post, Dirk said:
"You need to run the sourceCpp() call in each spawned process, or else get them your code."
This was in response to the questioner's attempt to distribute an Rcpp function to the worker processes. The questioner was sending the Rcpp function via:
clusterExport(cl = cl, varlist = "payoff")
I'm confused as to why this doesn't work. My understanding was that this is exactly what clusterExport() is for.
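For concreteness, the pattern I have in mind looks something like this (cluster size and inputs are just placeholders):

library(parallel)

Rcpp::cppFunction("NumericVector payoff(double strike, NumericVector data) {
  return pmax(data - strike, 0);
}")

cl <- makeCluster(2)                        # PSOCK cluster, as on Windows
clusterExport(cl = cl, varlist = "payoff")  # ship the 'payoff' object to the workers
parLapply(cl, c(95, 100), function(s) payoff(s, as.numeric(90:110)))  # errors on the workers
stopCluster(cl)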
Upvotes: 3
Views: 1943
Reputation: 20746
The issue here is that, because of how compiled code is linked into an R process, the compiled code is not "exportable" to the spawned processes unless it is embedded in a package.
Traditionally, the clusterExport() statement distributes R-specific objects to the workers. By using clusterExport() on an Rcpp function, you are only sending the R declaration and not the underlying shared library. That is to say, the shared library built by the R CMD SHLIB call in Rcpp's Attributes.R is not shared with or exported to the workers. As a result, when a call is then made to an Rcpp function on a worker, R cannot find the correct shared library.
Take the previous question's function:
Rcpp::cppFunction("NumericVector payoff(double strike, NumericVector data) {
  return pmax(data - strike, 0);
}")
Note: I'm using cppFunction() instead of sourceCpp(), but the results are equivalent since cppFunction() calls sourceCpp() to create the function.
Typing the function name:
payoff
yields the R declaration with a shared-library pointer:
function (strike, data)
.Primitive(".Call")(<pointer: 0x1015ec130>, strike, data)
This shared library is only available in the process that compiled the function.
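You can see this without a cluster at all: clusterExport() ships objects by serializing them, and an external pointer does not survive serialization. A round trip through serialize() mimics what happens on the wire (payoff2 is just an illustrative name):

payoff2 <- unserialize(serialize(payoff, NULL))

Printing payoff2 now shows a null pointer (the exact rendering is platform-dependent):

function (strike, data)
.Primitive(".Call")(<pointer: (nil)>, strike, data)

Calling payoff2() then typically fails with an error along the lines of "NULL value passed as symbol address", since the pointer no longer refers to a loaded shared library.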
Hence, it is always ideal to embed compiled code within a package and then distribute the package.
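Short of building a package, the fix is, as quoted above, to run the compilation in each spawned process, e.g. via clusterEvalQ(). A minimal sketch, assuming Rcpp and a compiler are available on every worker (each worker pays the compilation cost once):

library(parallel)

cl <- makeCluster(2)

# Compile on the workers themselves, so each process gets its own
# shared library and a valid pointer.
clusterEvalQ(cl, {
  Rcpp::cppFunction("NumericVector payoff(double strike, NumericVector data) {
    return pmax(data - strike, 0);
  }")
})

parLapply(cl, c(95, 100), function(s) payoff(s, as.numeric(90:110)))
stopCluster(cl)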
Upvotes: 8