I am fitting a statistical model whose likelihood and gradient evaluations depend on many complicated matrix operations, for which the Eigen library is very well-suited. I am working within Rcpp and having great success with RcppEigen. My likelihood/gradient contain a loop over the data size that is, in theory, trivially parallelizable. I would like to parallelize my likelihood computation using RcppParallel.
RcppParallel requires input of an Rcpp::NumericVector or Rcpp::NumericMatrix. I am able to transform between Eigen::MatrixXd and Rcpp::NumericMatrix thanks to a very helpful post here. In my example below, I reproduce one of the RcppParallel examples using this conversion. However, I find the performance hit of doing so is too severe to make the parallel computation worthwhile.
My questions are as follows:

- Can an RcppParallel worker take/operate on an Eigen::MatrixXd directly?
- If not, is there some other way to parallelize computations involving Eigen-specific matrix algebra?

Some further information:

- Rewriting my code to avoid Eigen is very undesirable.

Here is an example from the RcppParallel documentation, and a modification with Eigen classes that does compile, run, and give the right answer, but incurs a substantial performance hit. My desired outcome is to be able to input and output the Eigen-classed objects without incurring this performance hit, either with RcppParallel or some other method of parallelization.
Note that this case doesn't even cover what I actually want, which is to be able to use Eigen::MatrixXd etc. within the RcppParallel Worker. The RcppParallel documentation appears to explicitly advise against doing this type of conversion within the Worker: "The code that you write within parallel workers should not call the R or Rcpp API in any fashion." Nonetheless, even the example below incurs a performance hit, which has me wondering about the overall feasibility of what I would like to do here.
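For reference, my (possibly mistaken) understanding is that both Rcpp::NumericMatrix and Eigen::MatrixXd ultimately store their elements in one contiguous column-major double buffer, so the parallel kernel itself could in principle be written purely against double*, with no R, Rcpp, or Eigen types appearing inside the worker at all. A minimal standard-C++ sketch of that idea (the function name and threading scheme are mine, not anything from RcppParallel):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Apply sqrt element-wise over [input, input + n), writing into output,
// splitting the index range across a fixed number of threads. The loop
// body touches only raw doubles, so nothing here calls back into R.
void parallel_sqrt(const double* input, double* output,
                   std::size_t n, unsigned nthreads = 4) {
    std::vector<std::thread> pool;
    std::size_t chunk = (n + nthreads - 1) / nthreads;
    for (unsigned t = 0; t < nthreads; ++t) {
        std::size_t begin = t * chunk;
        std::size_t end = std::min(n, begin + chunk);
        if (begin >= end) break;
        pool.emplace_back([=] {
            for (std::size_t i = begin; i < end; ++i)
                output[i] = std::sqrt(input[i]);
        });
    }
    for (auto& th : pool) th.join();
}
```

Inside an RcppParallel Worker I would expect the same loop body to work unchanged, with begin/end supplied by parallelFor; the point is only that neither Eigen nor Rcpp types need to appear in the hot path.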
// [[Rcpp::depends(RcppEigen)]]
// [[Rcpp::depends(RcppParallel)]]
#include <RcppEigen.h>   // pulls in Rcpp; do not also include Rcpp.h directly
#include <RcppParallel.h>
#include <algorithm>
#include <cmath>
using namespace Rcpp;
using namespace RcppParallel;

// Square root all the elements in a big matrix
struct SquareRoot : public Worker {
  const RMatrix<double> input;
  RMatrix<double> output;

  // Constructor
  SquareRoot(const NumericMatrix input, NumericMatrix output)
    : input(input), output(output) {}

  // Call operator: square-root the elements in [begin, end)
  void operator()(std::size_t begin, std::size_t end) {
    std::transform(input.begin() + begin,
                   input.begin() + end,
                   output.begin() + begin,
                   static_cast<double(*)(double)>(std::sqrt));
  }
};

// [[Rcpp::export]]
NumericMatrix parallelMatrixSqrt(NumericMatrix x) {
  NumericMatrix output(x.nrow(), x.ncol());
  SquareRoot mysqrt(x, output);
  parallelFor(0, x.length(), mysqrt, 100);
  return output;
}

// Try it with an Eigen input and output, compare performance
// [[Rcpp::export]]
Eigen::MatrixXd parallelMatrixSqrtEigen(Eigen::MatrixXd x) {
  // Convert to NumericMatrix (this copies all of x)
  SEXP s = wrap(x);
  NumericMatrix xR(s);
  NumericMatrix output(x.rows(), x.cols());

  // Do the parallel computations
  SquareRoot mysqrt(xR, output);
  parallelFor(0, xR.length(), mysqrt, 100);

  // Convert back (another full copy)
  return as<Eigen::MatrixXd>(output);
}
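If I understand correctly (and I may not), the slowdown comes from the two full copies wrapped around the parallel region: wrap(x) allocates a fresh R matrix and copies all n^2 doubles in, and as<Eigen::MatrixXd>(output) copies them back out, so the Eigen version pays roughly two extra 8 MB copies plus allocation on every call. A toy standard-C++ illustration of the two memory-traffic patterns, with no Rcpp or Eigen involved (function names are mine):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Direct style: read from src, write results straight into dst.
void sqrt_direct(const std::vector<double>& src, std::vector<double>& dst) {
    dst.resize(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = std::sqrt(src[i]);
}

// Copy-in/copy-out style, mimicking what I believe wrap()/as<>() do:
// the same kernel, but bracketed by two full copies of the data.
void sqrt_via_copies(const std::vector<double>& src, std::vector<double>& dst) {
    std::vector<double> tmp_in(src);             // analogous to wrap(x)
    std::vector<double> tmp_out(tmp_in.size());
    for (std::size_t i = 0; i < tmp_in.size(); ++i)
        tmp_out[i] = std::sqrt(tmp_in[i]);
    dst = tmp_out;                               // analogous to as<Eigen::MatrixXd>(output)
}
```

For a memory-bound kernel like an element-wise square root, those extra copies can plausibly dominate the runtime, which would be consistent with the benchmark results below where parallelMatrixSqrtEigen is slower than even the serial sqrt(mm).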
Performance:
codepath <- "~/work/projects/misc/rcppparallel"
codefile <- "try-rcppparallel.cpp"
library(Rcpp)
library(RcppParallel)
sourceCpp(file.path(codepath,codefile))
n <- 1000
mm <- matrix(rnorm(n*n)^2,n,n)
microbenchmark::microbenchmark(
sqrt(mm),
parallelMatrixSqrt(mm),
parallelMatrixSqrtEigen(mm)
)
Results:
Unit: microseconds
                        expr      min        lq      mean    median       uq      max neval cld
                    sqrt(mm) 1179.242 1294.6775 1692.3123 1384.7955 1464.213 3581.924   100  b
      parallelMatrixSqrt(mm)  321.973  496.3665  909.3078  571.0685  784.617 3582.785   100  a
 parallelMatrixSqrtEigen(mm) 3223.133 3504.1675 4422.4277 4071.7715 5313.190 6737.735   100  c
A major thank you in advance!