Daniel James

Reputation: 1433

How to Write R Package Documentation for a Function with Parallel Backend

I want to turn this function into an R package:


#' Create sums package
#'
#' More detailed Description
#'
#' @describeIn This sums helps to
#'
#' @importFrom foreach foreach
#'
#' @importFrom doParallel registerDoParallel
#'
#' @param x Numeric Vector
#'
#' @importFrom doParallel `%dopar%`
#'
#' @importFrom parallel parallel
#'
#' @export
sums <- function(x){
plan(multisession)
n_cores <- detectCores() # check how many cores are present on the system
cl <- parallel::makeCluster(n_cores) # use all the detected cores
doParallel::registerDoParallel(cores = detectCores())

ss <- function(x){
  `%dopar%` <- foreach::`%dopar%`
  foreach::foreach(i = x, .combine = "+") %dopar% {i}
}

sss <- function(x){
  `%dopar%` <- foreach::`%dopar%`
  foreach::foreach(i = x, .combine = "+") %dopar% {i^2}
}

ssq <- function(x){
  `%dopar%` <- foreach::`%dopar%`
  foreach::foreach(i = x, .combine = "+") %dopar% {i^3}
}

sums <- function(x, methods = c("sum", "squaredsum", "cubedsum")){

  output <- c()

  if("sum" %in% methods){
    output <- c(output, ss = ss(x))
  }

  if("squaredsum" %in% methods){
    output <- c(output, sss = sss(x))
  }

  if("cubedsum" %in% methods){
    output <- c(output, ssq = ssq(x))
  }

  return(output)
}

parallel::stopCluster(cl = cl)
x <- 1:10

sums(x)


What I Need

Assume my vector x is so large that serial processing would take about 5 hours to complete the task, for example x <- 1:9e9, so that parallel processing can help. How do I include:

n_cores <- detectCores()
#cl <- makeCluster(n_cores)
#registerDoParallel(cores  =  detectCores())

in my .R file and DESCRIPTION file so that it meets the standards of R package documentation?

Upvotes: 0

Views: 231

Answers (1)

Comevussor

Reputation: 318

The scope of the question is not entirely clear, but I'll try to make relevant suggestions. I understand that you have problems running check on your package when its examples or tests use parallel computation.

  • First, remember that check applies CRAN standards, and CRAN policy does not allow examples or tests to use more than 2 cores. So your examples must be simple enough to run on 2 cores.
  • Second, there is a problem in your code: you create a cluster with makeCluster but never pass it to doParallel::registerDoParallel.
  • Third, your code uses both the parallel package and the doParallel package, so both must be declared in the DESCRIPTION file. You can do that by running in your console:
usethis::use_package("parallel")
usethis::use_package("doParallel")

This adds both packages to the "Imports" field of your DESCRIPTION. You then do not need to load these libraries explicitly inside your package.

  • Finally, qualify each function call in your example with the name of its package followed by "::", which would make your example look like:
    n_cores <- 2
    cl <- parallel::makeCluster(n_cores)
    doParallel::registerDoParallel(cl = cl)
    ...
    parallel::stopCluster(cl = cl)
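
Putting the points above together, a documented function might look roughly like the sketch below. The name par_sum and the n_cores argument are illustrative and not from the question, and the sketch assumes foreach and doParallel are listed under Imports in DESCRIPTION. The local %dopar% binding follows the pattern already used in the question.

```r
#' Parallel sum of a numeric vector
#'
#' Sums the elements of `x` with a foreach loop on a small cluster.
#'
#' @param x Numeric vector.
#' @param n_cores Number of worker processes. Keep this at 2 or fewer in
#'   examples and tests.
#' @return A single number, the sum of `x`.
#' @examples
#' par_sum(1:10, n_cores = 2)
#' @export
par_sum <- function(x, n_cores = 2L) {
  cl <- parallel::makeCluster(n_cores)
  # Shut the workers down even if the loop below fails
  on.exit(parallel::stopCluster(cl), add = TRUE)
  # Go back to the sequential backend so later %dopar% calls
  # do not try to use the stopped cluster
  on.exit(foreach::registerDoSEQ(), add = TRUE)

  # Register the cluster that was just created
  doParallel::registerDoParallel(cl)

  `%dopar%` <- foreach::`%dopar%`
  foreach::foreach(i = x, .combine = "+") %dopar% {
    i
  }
}
```

Because the cluster is stopped in on.exit, the caller never has to clean up, which also keeps the example section short.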

You can also refer to the registerDoParallel documentation for a similar piece of code; you will find that it, too, is limited to 2 cores.
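
One way to respect the 2-core limit without hard-coding it everywhere is a small helper like the sketch below. The helper name limit_cores is hypothetical. It relies on the _R_CHECK_LIMIT_CORES_ environment variable described in the R Internals manual, which is set during CRAN-style checks. Treat the exact handling of its values as an assumption to verify against that manual.

```r
# Hypothetical helper: choose a worker count that respects the check limit.
limit_cores <- function(requested = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one worker
  if (is.na(requested)) requested <- 1L

  # Any non-empty value other than "false"/"FALSE" is treated as "limit active"
  flag <- Sys.getenv("_R_CHECK_LIMIT_CORES_", unset = "")
  limited <- nzchar(flag) && !(flag %in% c("false", "FALSE"))

  if (limited) requested <- min(requested, 2L)
  max(1L, as.integer(requested))
}
```

A function could then call parallel::makeCluster(limit_cores()) and behave the same way on a workstation and under check.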

To be complete, I do not think you really need the foreach package, since R's built-in parallel package is very powerful. If you want to use your function with detectCores, I suggest adding a parameter that limits the number of cores. This function should do what you want in a more R-like manner:

sums <- function(x, methods, maxcores) {
  # Use at most maxcores workers (detectCores() can return NA)
  n_cores <- min(maxcores, parallel::detectCores(), na.rm = TRUE)
  cl <- parallel::makeCluster(n_cores)
  # Stop the cluster even if an error occurs below
  on.exit(parallel::stopCluster(cl), add = TRUE)

  sapply(
    X = methods,
    FUN = function(method) {
      # Pick the per-element transformation for this method
      f <- switch(
        method,
        sum = function(i) i,
        squaredsum = function(i) i^2,
        cubedsum = function(i) i^3,
        stop("Unknown method: ", method)
      )
      # Sum in double precision to stay clear of integer overflow
      sum(as.numeric(parallel::parSapply(cl = cl, X = x, FUN = f)))
    }
  )
}


x <- 1:10000000

sums(x = x, c("sum", "squaredsum"), 2)
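
For very long vectors, sending one element at a time to the workers adds overhead. A possible variant, sketched below, gives each worker one contiguous chunk and combines the partial sums. The names power_sum_chunk and chunked_power_sum and the power argument are hypothetical, and this is only one of several ways to split the work.

```r
# Worker function defined at top level, so it is shipped to the workers
# without the caller's local variables attached to it.
power_sum_chunk <- function(chunk, p) {
  sum(as.numeric(chunk)^p)
}

# Hypothetical chunked variant: one task per worker instead of one per element.
chunked_power_sum <- function(x, power = 1, n_cores = 2L) {
  cl <- parallel::makeCluster(n_cores)
  on.exit(parallel::stopCluster(cl), add = TRUE)

  # Split the positions of x into n_cores contiguous index blocks
  blocks <- parallel::splitIndices(length(x), n_cores)
  chunks <- lapply(blocks, function(idx) x[idx])

  # Each worker returns the partial sum of its own chunk
  partial <- parallel::parLapply(cl, chunks, power_sum_chunk, p = power)
  sum(unlist(partial))
}
```

Calling chunked_power_sum(x, power = 2, n_cores = 2) would then play the role of the "squaredsum" method above.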

Upvotes: 1
