aL_eX
aL_eX

Reputation: 1441

Parallelize function in R

I am pretty new to R, and have never attempted parallel computing before. I'm using the following functions to find perfect numbers (I know this is not the not the most efficient way of finding perfect numbers, since using Mersanne Primes is a lot faster):

factorlist <- function(n) {
  if(n < 1) {return("Invalid input: n must be a natural number")}
  if(n%%1 != 0) {return("Invalid input: n must be a natural number")}
  div <- 0
  if(n%%2 != 0) {
    for(i in seq(1, n-1, by = 2)) {
      if(n%%i == 0) {
        div[length(div)] <- i
        div <- c(div, 0)
      }
    }
    div <- div[-length(div)]
    return(div)
  }
  if(n%%2 == 0) {
    for(i in seq(1, n-1)) {
      if(n%%i == 0) {
        div[length(div)] <- i
        div <- c(div, 0)
      }
    }
    div <- div[-length(div)]
    return(div)
  }
}

perfectcheckN <- function(n) {
  if(n < 1) {return("Invalid input: n must be a natural number")}
  if(n%%1 != 0) {return("Invalid input: n must be a natural number")}
  start.time <- Sys.time()
  perf <- 0
  for(i in 6:n) {
    div <- factorlist(i)
    num <- 0
    for(j in 1:length(div)) {
      num <- num + div[j]
    }
    if(num == i) {
      perf[length(perf)] <- num
      perf <- c(perf, 0)
    }
  }
  perf <- perf[-length(perf)]
  end.time <- Sys.time()
  time.taken <- end.time - start.time
  return(cat(perf, "Time taken:", time.taken, "seconds."))
}

My CPU has 12 threads, and running this code normally uses a single thread and takes a very long time with large inputs. Is there a way to parallelize the code to make it run faster? Thanks in advance

Upvotes: 1

Views: 278

Answers (1)

Matt
Matt

Reputation: 14541

Initial Approach

One simple approach without much code changes would be to use parallel::mclapply assuming you are not on Windows (if you are look at parallelsugar::mclapply for a Windows implementation). However this is not ideal since it pre-schedules the numbers into buckets for each thread. For example if we have the number 40000 on a 4 core machine the buckets would be something like:

  • the first 6:10000
  • the second 10001:20000
  • the third 20001:30000
  • the fourth 30001:40000

The first 3 threads will finish before the last one. However, this will be faster than your original. A single run using my 4 core non-hyperthreaded Intel i5 3570K gives me the following results:

perfectcheckN(10000)

mclapply version: 6 28 496 8128 Time taken: 5.015653 seconds.
original version: 6 28 496 8128 Time taken: 14.84113 seconds.
speedup: 2.96x

Code:

library(parallel)

factorlist <- function(n) {
  if(n < 1) {return("Invalid input: n must be a natural number")}
  if(n%%1 != 0) {return("Invalid input: n must be a natural number")}
  div <- 0
  if(n%%2 != 0) {
    for(i in seq(1, n-1, by = 2)) {
      if(n%%i == 0) {
        div[length(div)] <- i
        div <- c(div, 0)
      }
    }
    div <- div[-length(div)]
    return(div)
  }
  if(n%%2 == 0) {
    for(i in seq(1, n-1)) {
      if(n%%i == 0) {
        div[length(div)] <- i
        div <- c(div, 0)
      }
    }
    div <- div[-length(div)]
    return(div)
  }
}

perfectcheckN <- function(n) {
  if(n < 1) {return("Invalid input: n must be a natural number")}
  if(n%%1 != 0) {return("Invalid input: n must be a natural number")}
  start.time <- Sys.time()

  perf <- mclapply(6:n, mc.cores = detectCores(), FUN = function(i) {
    div <- factorlist(i)
    num <- 0
    for(j in 1:length(div)) {
      num <- num + div[j]
    }
    if(num == i) {
      return(num)
    }
    return(NA)
  })
  perf <- t(as.data.frame(perf[!is.na(perf)]))
  perf <- c(perf, 0)
  perf <- perf[-length(perf)]
  end.time <- Sys.time()
  time.taken <- end.time - start.time
  return(cat(perf, "Time taken:", time.taken, "seconds."))
}

A Better Approach

mclapply has a mc.preschedule option we can set to F but this won't help since it does each integer per thread (instead of a range). However we can randomly sample 6:n to have all threads run for about the same time (and we'll need to sort perf to get the same order). Anyway, this will help a great deal. Beyond this, I'd look into optimizing the way you wrote the R code, but that isn't the question you were asking (but it too would speed things up).

perfectcheckN(10000)

mclapply version: 6 28 496 8128 Time taken: 4.393636 seconds.
original version: 6 28 496 8128 Time taken: 14.84113 seconds.
speedup: 3.38x

Updated Code:

library(parallel)

factorlist <- function(n) {
  if(n < 1) {return("Invalid input: n must be a natural number")}
  if(n%%1 != 0) {return("Invalid input: n must be a natural number")}
  div <- 0
  if(n%%2 != 0) {
    for(i in seq(1, n-1, by = 2)) {
      if(n%%i == 0) {
        div[length(div)] <- i
        div <- c(div, 0)
      }
    }
    div <- div[-length(div)]
    return(div)
  }
  if(n%%2 == 0) {
    for(i in seq(1, n-1)) {
      if(n%%i == 0) {
        div[length(div)] <- i
        div <- c(div, 0)
      }
    }
    div <- div[-length(div)]
    return(div)
  }
}

perfectcheckN <- function(n) {
  if(n < 1) {return("Invalid input: n must be a natural number")}
  if(n%%1 != 0) {return("Invalid input: n must be a natural number")}
  start.time <- Sys.time()

  perf <- mclapply(sample(6:n), mc.cores = detectCores(), FUN = function(i) {
    div <- factorlist(i)
    num <- 0
    for(j in 1:length(div)) {
      num <- num + div[j]
    }
    if(num == i) {
      return(num)
    }
    return(NA)
  })
  perf <- t(as.data.frame((perf[!is.na(perf)])))
  perf <- c(perf, 0)
  perf <- perf[-length(perf)]
  end.time <- Sys.time()
  time.taken <- end.time - start.time
  return(cat(sort(perf), "Time taken:", time.taken, "seconds."))
}

Upvotes: 2

Related Questions