Hirek

Reputation: 473

Excessive RAM usage when loading libraries with R's parallel package

On certain machines, loading packages on all cores eats up all available RAM, resulting in an error 137, and my R session is killed. On my laptop (a Mac) and on one Linux computer it works fine. On the Linux machine I actually want to run this on, a 32-core node with 32 × 6 GB of RAM, it does not. The sysadmin told me memory is limited on the compute nodes. However, as per my edit below, my memory requirements are not excessive by any stretch of the imagination.

How can I debug this and find out what is different? I am new to the parallel package.

Here is an example (it assumes that install.packages(c("tidyverse", "OpenMx")) has already been run under R version 4.0.3):

I also note that the problem seems to occur only with the OpenMx and mixtools packages. I excluded mixtools from the MWE because OpenMx alone is enough to reproduce the problem; tidyverse on its own works fine.

A workaround I tried was to not load the packages on the cluster at all, but instead evaluate .libPaths("~/R/x86_64-pc-linux-gnu-library/4.0/") in the expr body of clusterEvalQ and call functions by their namespace, e.g. OpenMx::vec, inside my functions. That produced the same error. So I am stuck, because the code works fine on two out of three machines, just not on the one I am supposed to use (a compute node).
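For reference, here is a minimal sketch of that workaround (the full example I am reporting on follows below; the worker count is arbitrary and only for illustration):

library(parallel)
cl <- makeCluster(4)                                  # small cluster just for the sketch
clusterEvalQ(cl, expr = {
  .libPaths("~/R/x86_64-pc-linux-gnu-library/4.0/")   # point the workers at the user library without attaching packages
})
## worker functions then reference packages by namespace, e.g. OpenMx::vec(...),
## instead of relying on library(OpenMx) having been run on each worker
stopCluster(cl)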

.libPaths("~/R/x86_64-pc-linux-gnu-library/4.0/")   # user library on the compute node
library(parallel)
num_cores <- detectCores()
cat("Number of cores found:")
print(num_cores)
working_mice <- makeCluster(num_cores)              # one worker per core
clusterExport(working_mice, ls())                   # copy everything in the global environment to each worker
clusterEvalQ(working_mice, expr = {
  library("OpenMx")
  library("tidyverse")
})

Simply loading the packages seems to consume all available RAM, resulting in an error 137. That is a problem because I need the libraries loaded on every available core, where their functions will be doing the actual work.

Further down the script I use DEoptim, but loading the packages alone was enough to trigger the error.
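For illustration only (this is not my actual call; the objective function and bounds below are placeholders): DEoptim can also manage its own parallel workers and load packages on them through DEoptim.control(), e.g.:

library(DEoptim)
placeholder_fn <- function(par) sum(par^2)       # placeholder objective, not my real function
result <- DEoptim(
  fn      = placeholder_fn,
  lower   = rep(-1, 3),
  upper   = rep(1, 3),
  control = DEoptim.control(
    itermax      = 20,
    parallelType = 1,                            # evaluate the objective in parallel via the parallel package
    packages     = c("OpenMx", "tidyverse")      # packages loaded on each DEoptim worker
  )
)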

Edit

I have profiled the code using profmem and found that the part shown in the example code allocates about 2 MB of memory, and the whole script I am trying to run about 94.75 MB. I then also checked with my OS's tools (Catalina) and captured the processes shown in the screenshot below.

None of these numbers strike me as excessive, especially not on a node that has ~6 GB per CPU and 32 cores, unless I am missing something major here.

[Screenshot: Memory information]
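For reference, a minimal sketch of the profmem measurement described above (simplified; note that profmem only records allocations made through R's allocator, so memory taken by loading compiled shared libraries will not show up in these totals):

library(profmem)
p <- profmem({
  library("OpenMx")
  library("tidyverse")
})
total(p) / 1e6   # total recorded allocations, in MB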

Upvotes: 6

Views: 1513

Answers (1)

Justin Landis

Reputation: 2071

I want to start by saying I'm not sure what is causing this issue for you. The following example may help you debug how much memory is being used by each child process.

Using mem_used from the pryr package will help you track how much RAM an R session is using. The following shows the results of doing this on my local computer, which has 8 cores and 16 GB of RAM.

library(parallel)
library(tidyr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)
num_cores <- detectCores()
cat("Number of cores found:") 
#> Number of cores found:
print(num_cores)
#> [1] 8
working_mice <- makeCluster(num_cores) 
clusterExport(working_mice, ls())
clust_memory <- clusterEvalQ(working_mice, expr = {
  start <- pryr::mem_used()
  library("OpenMx")
  mid <- pryr::mem_used()
  library("tidyverse")
  end <- pryr::mem_used()
  data.frame(mem_state = factor(c("start", "mid", "end"), levels = c("start", "mid", "end")),
             mem_used = c(start, mid, end), stringsAsFactors = FALSE)
})

to_GB <- function(x) paste(x/1e9, "GB")

tibble(
  clust_indx = seq_len(num_cores),
  mem = clust_memory
) %>%
  unnest(mem) %>% 
  ggplot(aes(mem_state, mem_used, group = clust_indx)) +
  geom_line(position = 'stack') +
  scale_y_continuous(labels = to_GB) #approximately

As you can see, each process uses about the same amount of RAM, roughly 160 MB on my machine. According to pryr::mem_used(), the amount of RAM used per core is the same after each library step.

In whatever environment you are working in, I'd recommend you try this with just 10 workers first and see whether the memory usage looks reasonable.
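For example, something along these lines (test_cluster and the worker count of 10 are just for illustration):

library(parallel)
test_cluster <- makeCluster(10)              # smaller test cluster
clusterExport(test_cluster, ls())
test_memory <- clusterEvalQ(test_cluster, expr = {
  library("OpenMx")
  library("tidyverse")
  pryr::mem_used()                           # RAM used by this worker after loading
})
stopCluster(test_cluster)
test_memory                                  # one memory figure per worker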

I also confirmed with htop that all the child processes are only using about 4.5 GB of virtual memory and approximately a similar amount of RAM each.

The only thing I can think of that may be the issue is clusterExport(working_mice, ls()). This would only be a problem if you are not doing this in a fresh R session: for example, if you had 5 GB of data sitting in your global environment, each socket would get its own copy.
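A quick way to check that (the object name obj_bytes is just for illustration) is to add up the sizes of everything in the global environment, since each worker receives its own copy of the lot:

obj_bytes <- vapply(ls(), function(nm) as.numeric(object.size(get(nm, envir = globalenv()))), numeric(1))
sort(obj_bytes, decreasing = TRUE) / 1e6   # approximate MB per object
sum(obj_bytes) / 1e6                       # approximate MB copied to each worker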

Upvotes: 2
