Reputation: 5155
How can I disable or suppress the "Updating loaded packages" popup window that keeps showing up during R package installation? I am happy to have it answer "No", but I do not know how to make that work (I went through the install.packages() arguments and did my googling, but found nothing).
Background: my goal is to compare the installation times of a large (2k) collection of packages. I want to run this overnight in a loop where each iteration (1) removes all but the base-priority packages and (2) measures the installation time of one particular package. For this to work there must be no popup windows, since they halt the process.
sessionInfo() when I start RStudio:
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1
>
Upvotes: 1
Views: 1729
Reputation: 78792
You should consider a benchmarking harness something akin to:
#!/bin/bash
# Create file of all installed packages
Rscript -e 'writeLines(unname(installed.packages()[,1]), "installed-pkgs.txt")'
# Iterate over the file, benchmarking package load 3x (consider bumping this up)
while read -r pkg; do
  echo -n "Benchmarking package [${pkg}]"
  for iter in {1..3}; do
    echo -n "."
    Rscript --vanilla \
      -e 'args <- commandArgs(TRUE)' \
      -e 'invisible(suppressPackageStartupMessages(xdf <- as.data.frame(as.list(system.time(library(args[1], character.only=TRUE), FALSE)))))' \
      -e 'xdf$pkg <- args[1]' \
      -e 'xdf$iter <- args[2]' \
      -e 'xdf$loaded_namespaces <- I(list(loadedNamespaces()))' \
      -e 'saveRDS(xdf, file.path("data", sprintf("%s-%s.rds", args[1], args[2])))' \
      "${pkg}" \
      "${iter}"
  done
  echo ""
done <installed-pkgs.txt
I made a ~/projects/pkgbench directory with a data subdirectory and put the script above in ~/projects/pkgbench (saved as pkgbench.sh).
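A minimal sketch of that setup, in case it helps. The PKGBENCH_DIR variable is my own addition for illustration; the only things taken from the answer are the ~/projects/pkgbench location, the data subdirectory, and the pkgbench.sh file name.

```shell
# Assumption: PKGBENCH_DIR is a hypothetical override; it defaults to the
# directory used in the answer.
PKGBENCH_DIR="${PKGBENCH_DIR:-$HOME/projects/pkgbench}"

# The harness saves one .rds file per run into data/, so it must exist first.
mkdir -p "$PKGBENCH_DIR/data"

# Make the script executable if it has already been saved there.
if [ -f "$PKGBENCH_DIR/pkgbench.sh" ]; then
  chmod +x "$PKGBENCH_DIR/pkgbench.sh"
fi
```

The script writes to a relative data path, so it should be started from inside that directory (cd "$PKGBENCH_DIR" first).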
When it runs (from a non-RStudio terminal session on your macOS box) you get progress (one dot per iteration):
$ ./pkgbench.sh
Benchmarking package [abind]...
Benchmarking package [acepack]...
Benchmarking package [AER]...
Benchmarking package [akima]...
You can then do something like (I killed the benchmark after just a few pkgs):
library(hrbrthemes) # github/gitlab
library(tidyverse)
map_df(
  list.files("~/projects/pkgbench/data", full.names = TRUE),
  readRDS
) %>% tbl_df() %>% print() -> bench_df
## # A tibble: 141 x 8
##    user.self sys.self elapsed user.child sys.child pkg     iter  loaded_namespaces
##        <dbl>    <dbl>   <dbl>      <dbl>     <dbl> <chr>   <chr> <list>
##  1   0.00500  0.00100 0.00700         0.        0. abind   1     <chr [9]>
##  2   0.00600  0.00100 0.00700         0.        0. abind   2     <chr [9]>
##  3   0.00600  0.00100 0.00600         0.        0. abind   3     <chr [9]>
##  4   0.00500  0.00100 0.00600         0.        0. acepack 1     <chr [9]>
##  5   0.00600 0.001000 0.00800         0.        0. acepack 2     <chr [9]>
##  6   0.00500  0.00100 0.00600         0.        0. acepack 3     <chr [9]>
##  7      1.11   0.0770    1.19         0.        0. AER     1     <chr [36]>
##  8      1.04   0.0670    1.11         0.        0. AER     2     <chr [36]>
##  9      1.07   0.0720    1.15         0.        0. AER     3     <chr [36]>
## 10     0.136   0.0110   0.147         0.        0. akima   1     <chr [12]>
## # ... with 131 more rows
group_by(bench_df, pkg) %>%
  summarise(
    med_elapsed = median(elapsed),
    ns_ct = length(loaded_namespaces[[1]])
  ) -> bench_sum
ggplot(bench_sum, aes("elapsed", med_elapsed)) +
  geom_violin(fill = ft_cols$gray) +
  ggbeeswarm::geom_quasirandom(color = ft_cols$yellow) +
  geom_boxplot(color = "white", fill="#00000000", outlier.colour = NA) +
  theme_ft_rc(grid="Y")
ggplot(bench_sum, aes(ns_ct, med_elapsed)) +
  geom_point(color = ft_cols$yellow) +
  geom_smooth(color = ft_cols$peach) + # shld prbly use something better than loess
  theme_ft_rc(grid = "XY")
If you are going to run it overnight, make sure you disable anything macOS might do when it thinks the machine is idle (e.g. turn off any heavyweight screensavers, prevent it from putting disks to sleep, etc).
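One way to keep the machine awake for the duration of the run is the caffeinate utility that ships with macOS. The wrapper below is only a sketch: run_awake is a name I made up for illustration, and it falls back to running the command as-is on systems without caffeinate.

```shell
# run_awake is a hypothetical helper, not part of the original harness.
# caffeinate flags: -i prevents idle sleep, -m prevents disk idle sleep,
# -s prevents system sleep (effective on AC power only).
run_awake() {
  if command -v caffeinate >/dev/null 2>&1; then
    caffeinate -i -m -s "$@"
  else
    # No caffeinate available (not macOS): just run the command.
    "$@"
  fi
}

# Usage, from the directory that holds the harness:
#   run_awake ./pkgbench.sh
```

caffeinate only holds its assertions while the wrapped command is running, so nothing needs to be undone afterwards. Screensaver settings are separate and still need to be changed by hand.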
Note that I suppressed package startup messages from printing. You may want to capture.output() instead or do a comparison with and without that.
library() also has all these parameter options:
library(
  package,
  help,
  pos = 2,
  lib.loc = NULL,
  character.only = FALSE,
  logical.return = FALSE,
  warn.conflicts = TRUE,
  quietly = FALSE,
  verbose = getOption("verbose")
)
You may want to tweak those for various benchmarking runs as well.
I also only looked at the median of the elapsed value, i.e. "what the package load felt like to the user". Consider examining all of the system.time values that are in the data frame.
If your Mac is sufficiently beefy CPU-core-wise and you have a fast solid state disk, you could consider using GNU parallel with this harness to get through the whole run faster. I'd definitely use more than 3 iterations per package if you do this, and be fairly conservative with the number of concurrent parallel runs, since concurrent jobs compete for CPU and disk and add noise to the timings.
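A rough sketch of how the work could be handed to GNU parallel. Everything named here is an assumption for illustration: make_jobs is a helper I made up that emits one "pkg iter" line per run, and bench-one.sh is a hypothetical script that would wrap the single Rscript call from the harness and take the package name and iteration as its two arguments.

```shell
# make_jobs is a hypothetical helper, not part of the original harness.
# $1: file with one package name per line; $2: iterations per package.
# Prints one "pkg iter" line for every benchmark run.
make_jobs() {
  while read -r pkg; do
    for iter in $(seq 1 "$2"); do
      printf '%s %s\n' "$pkg" "$iter"
    done
  done <"$1"
}

# Usage (assumes GNU parallel is installed and bench-one.sh exists):
#   make_jobs installed-pkgs.txt 10 | \
#     parallel --jobs 2 --colsep ' ' ./bench-one.sh {1} {2}
```

Keeping --jobs low is deliberate here: the fewer runs compete for the same cores and disk, the closer the numbers stay to what a single load would measure.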
Upvotes: 2