user321627

Reputation: 2572

In R, is there a way I can benchmark and approximate the time it takes to save a 7 GB object using the saveRDS function?

In R, I currently have a 7 GB list object that I need to save. Is there a way to benchmark and approximate the time it takes to save it using saveRDS() with its defaults (such as file compression)? I have tried to approximate it but am unsure how to do so. I am on a computer with 16 cores (not sure if that makes a difference), 30 GB of RAM, and a fast 3+ GHz processor.

Thanks.

Upvotes: 1

Views: 74

Answers (1)

DJV

Reputation: 4873

I'm not sure if that's what you meant, but you can use the 'rbenchmark' package (here is a nice blog post on different ways to benchmark, including rbenchmark).

I did some benchmarking with a 1.1 GB list object.

library(rbenchmark)

# Three data frames of random 0/1 integers (1000 rows x 100000 columns each)
Mylist <- list(a = data.frame(replicate(100000, sample(0:1, 1000, replace = TRUE))), 
               b = data.frame(replicate(100000, sample(0:1, 1000, replace = TRUE))), 
               c = data.frame(replicate(100000, sample(0:1, 1000, replace = TRUE))))

print(object.size(Mylist), units = "auto")

1.1 Gb

scores <- rbenchmark::benchmark("saveRDS_compress" = {
  saveRDS(Mylist, file = tempfile("mylist.rds"), compress = TRUE)
},
"saveRDS_not_compress" = {
  saveRDS(Mylist, file = tempfile("mylist.rds"), compress = FALSE)
}, 
"save_compress" = {
  save(Mylist, file = tempfile("mylist.rds"), compress = TRUE)
},
"save_not_compress" = {
  save(Mylist, file = tempfile("mylist.rds"), compress = FALSE)
},
"rlist::list.save_list.rds" = {
  rlist::list.save(Mylist, 'list.rds')
},
"rlist::list.save_list.rdata" = {
  rlist::list.save(Mylist, 'list.rdata')
},
"rlist::list.save_list.yaml" = {
  rlist::list.save(Mylist, 'list.yaml')
},
replications = 20,
columns = c("test", "replications", "elapsed",
            "relative", "user.self", "sys.self"))

dplyr::arrange(scores, elapsed)

                          test replications elapsed relative user.self sys.self
1        saveRDS_not_compress           20   82.20    1.000     23.68    23.83
2           save_not_compress           20   92.39    1.124     23.80    27.14
3 rlist::list.save_list.rdata           20  889.49   10.821    885.52     2.13
4   rlist::list.save_list.rds           20  912.86   11.105    909.09     1.95
5            saveRDS_compress           20  913.64   11.115    910.30     1.89
6               save_compress           20  919.03   11.180    915.03     2.13
7  rlist::list.save_list.yaml           20 3258.30   39.639   3155.67    97.20
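
To turn the table above into a rough guess for a 7 GB object, one option is to divide the elapsed time by the number of replications and scale by the size ratio. This is only a back-of-the-envelope sketch: it assumes the time grows linearly with object size and that your data compresses about as well as my random 0/1 data, neither of which I have verified. The variable names are mine.

```r
# Elapsed seconds for 20 replications, copied from the table above
elapsed <- c(saveRDS_not_compress = 82.20, saveRDS_compress = 913.64)
replications <- 20

size_benchmarked_gb <- 1.1  # size reported by object.size() for Mylist
size_target_gb <- 7         # size of the object in the question

# Seconds for a single save of the 1.1 GB object
per_save <- elapsed / replications

# Assumption: save time scales linearly with object size
estimate <- per_save * size_target_gb / size_benchmarked_gb
round(estimate)
```

On my machine that works out to roughly 26 seconds without compression and a bit under 5 minutes with the default compression. Note that in the compressed runs the user time is almost equal to the elapsed time, which suggests the work is CPU-bound rather than limited by the disk.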

System info: Windows 10 (64-bit), Intel i7-7700 3.60 GHz, 32 GB RAM.

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
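
If you would rather measure on your own machine than rely on my numbers, a minimal sketch is to time saveRDS() on a slice of the list with system.time() and scale up by the size ratio. The helper estimate_save_time() below is hypothetical (not from any package), and it assumes the slice is representative of the whole list and that time scales linearly with size.

```r
# Hypothetical helper: time saveRDS() on the first `fraction` of a list's
# elements and extrapolate to the full object by the object.size() ratio.
estimate_save_time <- function(x, fraction = 0.1, compress = TRUE) {
  n <- max(1L, floor(length(x) * fraction))
  part <- x[seq_len(n)]

  path <- tempfile(fileext = ".rds")
  on.exit(unlink(path), add = TRUE)  # remove the temporary file afterwards

  elapsed <- system.time(
    saveRDS(part, file = path, compress = compress)
  )[["elapsed"]]

  # Assumption: save time is proportional to in-memory size
  scale <- as.numeric(object.size(x)) / as.numeric(object.size(part))
  c(sample_seconds = elapsed, estimated_seconds = elapsed * scale)
}

# Small stand-in list so the example runs quickly
demo <- replicate(50, sample(0:1, 1e4, replace = TRUE), simplify = FALSE)
est <- estimate_save_time(demo, fraction = 0.2)
est
```

If your list has only a few very large elements, slicing top-level elements like this is too coarse, and you would need to subset inside an element (for example, some rows of a data frame) instead.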

Upvotes: 1
