Can I speed up my R code with Rcpp?

Question

I defined an R function that takes a matrix, a vector and a parameter a. I need to evaluate the function for many different values of a. This is simple to code in R, but very slow when the matrix is large and there are many parameter values.

Can I define the function in R and do the for loop in Rcpp?

Can it speed up the computations?

A minimal example of such a function in R is:

f <- function(X,y,a){
  p = ncol(X)
  res = (crossprod(X) + a*diag(1,p))%*%crossprod(X,y)
  }

set.seed(0)
X <- matrix(rnorm(50*5),50,5)
y <- rnorm(50)
a <- seq(0,1,0.1)

result <- matrix(NA,ncol(X),length(a))

for(i in 1:length(a)){                  # Can I do this part in Rcpp?
  result[,i] <- f(X,y,a[i])
  }

result

Ralf Stubner · Accepted Answer

The answer by 李哲源 correctly identifies what should be done in your case. As for your original question, the answer is two-fold: yes, you can move the loop to C++ with Rcpp; no, you won't gain performance:

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericMatrix fillMatrix(Rcpp::NumericMatrix X,
                   Rcpp::NumericVector y,
                   Rcpp::NumericVector a,
                   Rcpp::Function f) {
  Rcpp::NumericMatrix result = Rcpp::no_init(X.cols(), a.length());
  for (int i = 0; i < a.length(); ++i) {
    // Each call goes back into the R interpreter to evaluate f
    result(Rcpp::_, i) = Rcpp::as<Rcpp::NumericVector>(f(X, y, a[i]));
  }
  return result;
}

/*** R
f <- function(X,y,a){
  p = ncol(X)
  res = (crossprod(X) + a*diag(1,p))%*%crossprod(X,y)
  }

X <- matrix(rnorm(500*50),500,50)
y <- rnorm(500)
a <- seq(0,1,0.01)

system.time(fillMatrix(X, y, a, f))
#   user  system elapsed 
#  0.052   0.077   0.075 
system.time({result <- matrix(NA,ncol(X),length(a))

for(i in 1:length(a)){
  result[,i] <- f(X,y,a[i])
  }
})
#   user  system elapsed 
#  0.060   0.037   0.049 
*/

So the Rcpp solution is actually slower than the R solution in this case. Why? Because the real work is done within the function f. That work is the same for both solutions, but the Rcpp solution has the additional overhead of calling back to R from C++. Note that for loops in R are not necessarily slow. BTW, here is some benchmark data:

          expr      min       lq     mean   median       uq      max neval
 fillMatrixR() 41.22305 41.86880 46.16806 45.20537 49.11250 65.03886   100
 fillMatrixC() 44.57131 44.90617 49.76092 50.99102 52.89444 66.82156   100
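
Since the time is spent inside f, one place to look for a speedup is f itself rather than the loop. The following is only a minimal sketch, assuming the sole change is to compute the parts that do not depend on a once, outside the loop. The name fill_matrix_hoisted is hypothetical and not part of the original post; f is the function from the question, written with an explicit return value.

```r
# Function from the question (returns the value visibly here).
f <- function(X, y, a) {
  p <- ncol(X)
  (crossprod(X) + a * diag(1, p)) %*% crossprod(X, y)
}

# Hypothetical helper: hoist the loop-invariant products out of the loop.
fill_matrix_hoisted <- function(X, y, a) {
  p   <- ncol(X)
  XtX <- crossprod(X)     # t(X) %*% X, does not depend on a
  Xty <- crossprod(X, y)  # t(X) %*% y, does not depend on a
  I_p <- diag(1, p)

  result <- matrix(NA_real_, p, length(a))
  for (i in seq_along(a)) {
    # Only this cheap p-by-p step is repeated for every value of a.
    result[, i] <- (XtX + a[i] * I_p) %*% Xty
  }
  result
}

# Usage with the data from the question.
set.seed(0)
X <- matrix(rnorm(50 * 5), 50, 5)
y <- rnorm(50)
a <- seq(0, 1, 0.1)

hoisted <- fill_matrix_hoisted(X, y, a)

# Reference: the original loop that calls f for every value of a.
reference <- matrix(NA_real_, ncol(X), length(a))
for (i in seq_along(a)) {
  reference[, i] <- f(X, y, a[i])
}
```

Both versions should give the same matrix; how much time the hoisted version saves depends on the size of X and the number of values in a, so it is worth benchmarking on your own data.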
