Reputation: 473
On certain machines, loading packages on all cores eats up all available RAM, resulting in error 137, and my R session is killed. On my laptop (Mac) and on one Linux computer it works fine. On the Linux computer I actually want to run this on, a 32-core machine with 32 * 6 GB of RAM, it does not. The sysadmin told me memory is limited on the compute nodes. However, as shown in my edit below, my memory requirements are not excessive by any stretch of the imagination.
How can I debug this and find out what is different? I am new to the parallel package.
Here is an example (it assumes that install.packages(c("tidyverse", "OpenMx")) has been run under R 4.0.3):
I also note that this seems to be true only for the OpenMx and mixtools packages. I excluded mixtools from the MWE because OpenMx alone is enough to generate the problem; tidyverse alone works fine.
A workaround I tried was to not load the packages on the cluster at all: I only evaluated .libPaths("~/R/x86_64-pc-linux-gnu-library/4.0/") in the body of the expr argument of clusterEvalQ and used namespace-qualified calls such as OpenMx::vec in my functions (a sketch of this approach is shown further down), but that produced the same error. So I am stuck, because on two out of three machines it worked fine, just not on the one I am supposed to use (a compute node).
.libPaths("~/R/x86_64-pc-linux-gnu-library/4.0/")

library(parallel)

num_cores <- detectCores()
cat("Number of cores found:")
print(num_cores)

# one worker per core, then copy the global environment to every worker
working_mice <- makeCluster(num_cores)
clusterExport(working_mice, ls())

# load the packages on every worker -- this is the step that exhausts the RAM
clusterEvalQ(working_mice, expr = {
  library("OpenMx")
  library("tidyverse")
})
Simply loading the packages seems to consume all available RAM, resulting in error 137. That is a problem, because I need the libraries loaded on each available core, where their functions perform the actual work. Later on I use DEoptim, but loading the packages was already enough to trigger the error.
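Concretely, the workaround looked roughly like this (a minimal sketch: obj_fn and its use of OpenMx::mxMatrix are only stand-ins for my real functions):

# workers only get the library path; no library() calls on the cluster
clusterEvalQ(working_mice, expr = {
  .libPaths("~/R/x86_64-pc-linux-gnu-library/4.0/")
})

# stand-in worker function that reaches OpenMx through its namespace
obj_fn <- function(par) {
  A <- OpenMx::mxMatrix(type = "Full", nrow = 2, ncol = 2, values = par, name = "A")
  sum(A@values^2)
}

clusterExport(working_mice, "obj_fn")
parSapply(working_mice, 1:4, obj_fn)

(Presumably this fails the same way because a :: call still loads the package namespace on the worker, so the memory cost is nearly the same as library() minus the attach step.)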
I have profiled the code using profmem and found that the part shown in the example asks for about 2 MB of memory, and the whole script I am trying to run for about 94.75 MB. I then also checked in my OS (Catalina) and saw the processes shown in the screenshot. None of these numbers strike me as excessive, especially not on a node that has ~6 GB per CPU and 32 cores, unless I am missing something major here.
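The profiling was along these lines (a minimal sketch; the real script is profiled the same way):

library(profmem)

# note: profmem only records R-level allocations,
# not everything the OS-level process actually uses
p <- profmem({
  library("OpenMx")
  library("tidyverse")
})

total(p) / 1e6  # total allocations in MB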
Upvotes: 6
Views: 1513
Reputation: 2071
I want to start by saying I'm not sure what is causing this issue for you. The following example may help you debug how much memory is being used by each child process.
Using mem_used from the pryr package will help you track how much RAM is used by an R session. The following shows the results of doing this on my local computer with 8 cores and 16 GB of RAM.
library(parallel)
library(tidyr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(ggplot2)
num_cores <- detectCores()
cat("Number of cores found:")
#> Number of cores found:
print(num_cores)
#> [1] 8
working_mice <- makeCluster(num_cores)
clusterExport(working_mice, ls())
clust_memory <- clusterEvalQ(working_mice, expr = {
  start <- pryr::mem_used()
  library("OpenMx")
  mid <- pryr::mem_used()
  library("tidyverse")
  end <- pryr::mem_used()
  data.frame(
    mem_state = factor(c("start", "mid", "end"), levels = c("start", "mid", "end")),
    mem_used  = c(start, mid, end),
    stringsAsFactors = FALSE
  )
})

to_GB <- function(x) paste(x / 1e9, "GB")

tibble(
  clust_indx = seq_len(num_cores),
  mem = clust_memory
) %>%
  unnest(mem) %>%
  ggplot(aes(mem_state, mem_used, group = clust_indx)) +
  geom_line(position = "stack") +
  scale_y_continuous(labels = to_GB)  # approximately
As you can see, each process uses about the same amount of RAM, roughly 160 MB on my machine. According to pryr::mem_used(), the amount of RAM used per worker is essentially the same after each library() step.
In whatever environment you are working in, I'd recommend you do this on just 10 workers and see if it is using a reasonable amount of memory.
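For example, something like this (a sketch reusing the measurement idea from above, just with a capped worker count):

library(parallel)

# cap the cluster size while debugging memory usage
n_workers <- min(detectCores(), 10)
working_mice <- makeCluster(n_workers)

clust_memory <- clusterEvalQ(working_mice, expr = {
  start <- pryr::mem_used()
  library("OpenMx")
  library("tidyverse")
  end <- pryr::mem_used()
  c(start = start, end = end)
})

stopCluster(working_mice)
clust_memory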
I also confirmed with htop/top that the child processes are only using about 4.5 GB of virtual memory and approximately a similar amount of RAM each.
The only thing I can think of that might be the issue is clusterExport(working_mice, ls()). This would only be a problem if you are not doing this in a fresh R session: for example, if you had 5 GB of data sitting in your global environment, each socket worker would get its own copy.
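If that might be what is happening, a quick check (a sketch using base R only) of what clusterExport(working_mice, ls()) would copy to each worker:

# size of every object in the global environment, largest first
obj_sizes <- vapply(ls(envir = .GlobalEnv),
                    function(nm) as.numeric(object.size(get(nm, envir = .GlobalEnv))),
                    numeric(1))
sort(obj_sizes, decreasing = TRUE)

# rough total duplicated on *each* worker when exporting ls()
sum(obj_sizes) / 1e6  # in MB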
Upvotes: 2