Joseph Yu


My functions using Rcpp and snowfall run slower inside an R package than outside it

I am developing an R package that mainly uses Rcpp, RcppArmadillo and snowfall (for parallel computing). It passed both `devtools::check()` and `devtools::check_rhub()`. However, I noticed that the same code runs slower when called from the package than when run outside it, and I would like to close the gap between the two. For example,

This is the system.time() result for running my code in the package.

user system elapsed
18.72 31.12 135.56

On the other hand, here is the system.time() result running my code normally outside of the package.

user system elapsed
4.23 2.98 102.18

In addition, I profiled both versions with profvis and noticed that sfLapply (the parallel computing step) takes significantly longer in the package environment. Below is a short summary of how my code is structured in each case. If needed, I'll add a link to my GitHub code page.
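To narrow down where the extra sfLapply time goes, the export step and the apply step can be timed separately. The following is only a minimal sketch under stated assumptions: the matrix `a` and the anonymous worker function are hypothetical stand-ins for my real objects and for update_a, and two workers are used instead of ten.

```r
library(snowfall)

sfInit(parallel = TRUE, cpus = 2)   # two local workers (assumption; the real code uses 10)

# Hypothetical payload standing in for the real parameter objects
a <- matrix(rnorm(1e6), nrow = 1000)

# Time the export to the workers on its own
t_export <- system.time(sfExport("a"))

# Time the parallel apply on its own; each worker reads its exported copy of `a`
t_apply <- system.time(
  res <- sfLapply(0:1, function(node) sum(a[node + 1, ]))
)

sfStop()

# One row per step: user, system and elapsed times side by side
print(rbind(export = t_export, apply = t_apply))
```

Running the same harness once from inside the package and once from a plain script should show whether the gap comes from shipping objects to the workers or from the computation itself.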

Code in package

The cpp functions are built into a DLL.

foo = function(...,core=10){

  sfInit(parallel=TRUE,cpus=core) ## set number of cores for parallel computing
  sfClusterSetupRNG( type="RNGstream",seed=cluster_seed) ## set cluster seed for reproducibility

  ## read inputs and set parameters
   .
   .
   .
  sfExport(list=ls()) ##exporting parameters to cores

  node = 0:(core-1)  

  print(system.time({

  for(i in 1:iter){

       ## update_a and update_b are cpp functions built in DLL using "build and reload"
       a = sfLapply(node, update_a, a, b,...) 

       sfExport("a")
       
       b = sfLapply(node, update_b, a, b,...) 

       sfExport("b")
         .
         .
         .
   }
   }))
   return(...)
}

Namespace:

export(foo)
import(RcppArmadillo)
import(rlecuyer)
import(snow)
import(rlecuyer)
importFrom("snowfall",...)
importFrom("stats","rgamma")
importFrom("stats","runif")
importFrom(Rcpp,sourceCpp)
useDynLib(mypackage, .registration=TRUE)

Code outside of package

The main differences from the above are that I need to:

  1. Load the dependent packages on the clusters/cores through sfLibrary("Rcpp", character.only=TRUE) and sfLibrary("RcppArmadillo", character.only=TRUE).
  2. Compile my cpp code via sourceCpp(code = RcppCode) and then load it on the clusters/cores via sfClusterEval(sourceCpp(code = RcppCode)), where RcppCode contains all my cpp code as follows,
RcppCode = '
#include <RcppArmadillo.h>
#include <Rmath.h>
using namespace Rcpp;
// Enable C++11 via this plugin (Rcpp 0.10.3 or later)
// [[Rcpp::plugins("cpp11")]]

//[[Rcpp::depends(RcppArmadillo)]]

.
.
.
// [[Rcpp::export]]
List update_a(...){
.
.
.
return List::create(...);
}
.
.
.
'
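For concreteness, the two steps listed above can be sketched end to end as follows. This is a reduced illustration, not my actual code: `add_node` is a hypothetical stand-in for update_a, plain Rcpp is used instead of RcppArmadillo to keep it short, and it assumes a working C++ toolchain (Rtools on Windows) on the machine.

```r
library(snowfall)
library(Rcpp)

# Hypothetical stand-in for the real RcppCode string
RcppCode <- '
#include <Rcpp.h>
// [[Rcpp::export]]
double add_node(int node, double x) { return node + x; }
'

sfInit(parallel = TRUE, cpus = 2)          # two local workers (assumption)

# Step 1: load the dependent package on every worker
sfLibrary("Rcpp", character.only = TRUE)

# Step 2: ship the source string, then compile and load it on every worker
sfExport("RcppCode")
sfClusterEval(sourceCpp(code = RcppCode))

# The wrapper looks up add_node by name on the worker, so each worker
# calls the copy it compiled itself
res <- sfLapply(0:1, function(node, x) add_node(node, x), 1.5)

sfStop()
print(unlist(res))
```

In this setup every worker compiles and loads its own copy of the C++ code, whereas in the package version the workers rely on the installed DLL.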

Session Info for the dependent packages and platform:

R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RcppArmadillo_0.10.1.2.0 rlecuyer_0.3-5           Rcpp_1.0.5               snowfall_1.84-6.1        snow_0.4-3              

loaded via a namespace (and not attached):
[1] compiler_4.0.3 parallel_4.0.3 tools_4.0.3 

Any suggestions on how to close the speed gap between the two are much appreciated. Thank you very much.
