Reputation: 2572
In R, I currently have a 7 GB list object that I need to save. Is there a way to benchmark or approximate how long it takes to save the 7 GB object using saveRDS() with its defaults (such as file compression)? I have tried to approximate it but am unsure how to do so. I am on a computer with 16 cores (not sure if that makes a difference), 30 GB of RAM, and a fast 3+ GHz processor.
Thanks.
Upvotes: 1
Views: 74
Reputation: 4873
I'm not sure if that's what you meant, but you can use the 'rbenchmark' package (here is a nice blog post on different ways to benchmark, including rbenchmark).
I did some benchmarking with a 1.1 GB list object.
library(rbenchmark)
# Three data frames of random 0/1 values, about 1.1 GB in memory in total
Mylist <- list(a = data.frame(replicate(100000, sample(0:1, 1000, replace = TRUE))),
               b = data.frame(replicate(100000, sample(0:1, 1000, replace = TRUE))),
               c = data.frame(replicate(100000, sample(0:1, 1000, replace = TRUE))))
print(object.size(Mylist), units = "auto")
1.1 Gb
scores <- rbenchmark::benchmark(
    "saveRDS_compress" = {
        saveRDS(Mylist, file = tempfile("mylist.rds"), compress = TRUE)
    },
    "saveRDS_not_compress" = {
        saveRDS(Mylist, file = tempfile("mylist.rds"), compress = FALSE)
    },
    "save_compress" = {
        save(Mylist, file = tempfile("mylist.rds"), compress = TRUE)
    },
    "save_not_compress" = {
        save(Mylist, file = tempfile("mylist.rds"), compress = FALSE)
    },
    # rlist::list.save() picks the output format from the file extension
    # and writes to the current working directory here
    "rlist::list.save_list.rds" = {
        rlist::list.save(Mylist, 'list.rds')
    },
    "rlist::list.save_list.rdata" = {
        rlist::list.save(Mylist, 'list.rdata')
    },
    "rlist::list.save_list.yaml" = {
        rlist::list.save(Mylist, 'list.yaml')
    },
    replications = 20,
    columns = c("test", "replications", "elapsed",
                "relative", "user.self", "sys.self"))
dplyr::arrange(scores, elapsed)
                         test replications elapsed relative user.self sys.self
1        saveRDS_not_compress           20   82.20    1.000     23.68    23.83
2           save_not_compress           20   92.39    1.124     23.80    27.14
3 rlist::list.save_list.rdata           20  889.49   10.821    885.52     2.13
4   rlist::list.save_list.rds           20  912.86   11.105    909.09     1.95
5            saveRDS_compress           20  913.64   11.115    910.30     1.89
6               save_compress           20  919.03   11.180    915.03     2.13
7  rlist::list.save_list.yaml           20 3258.30   39.639   3155.67    97.20
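Note that the elapsed column is the total over all 20 replications, not the time for one save. To turn the table above into a rough guess for a 7 GB object, you can divide by the number of replications and scale by size. This is only a sketch: it assumes save time grows linearly with object size, and the numbers apply to my machine and to this very compressible 0/1 test data, so treat the result as an order of magnitude rather than a prediction.

```r
# Figures copied from the benchmark table above (20 replications, 1.1 GB object)
replications <- 20
size_gb <- 1.1
elapsed_total <- c(saveRDS_compress = 913.64, saveRDS_not_compress = 82.20)

per_save <- elapsed_total / replications   # seconds for a single save
per_gb <- per_save / size_gb               # seconds per GB (assumes linear scaling)
estimate_7gb <- per_gb * 7                 # rough estimate for a 7 GB object

print(round(per_save, 2))
print(round(estimate_7gb))
```

On these figures a single compressed saveRDS() call took about 46 seconds and an uncompressed one about 4 seconds, which extrapolates to roughly 291 and 26 seconds for 7 GB under the linear-scaling assumption.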
System info: Windows 10 (64-bit), Intel i7-7700 3.60 GHz, 32 GB RAM.
> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
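If you would rather estimate on your own machine and data, one option is to time saveRDS() on a small sample of the list with base R's system.time() and scale up by the in-memory size ratio. The helper below is a minimal sketch; estimate_save_time and its arguments are hypothetical names I made up for illustration, and the linear scaling is an assumption (compression ratio and disk caching can make the real figure differ).

```r
# Hypothetical helper: time saveRDS() on a random sample of list elements
# and scale linearly by in-memory size. Linear scaling is an assumption.
estimate_save_time <- function(x, n_sample = 10, compress = TRUE) {
  stopifnot(is.list(x), length(x) > 0)
  idx <- sample(seq_along(x), min(n_sample, length(x)))
  part <- x[idx]
  path <- tempfile(fileext = ".rds")
  on.exit(unlink(path), add = TRUE)          # remove the temporary file afterwards
  elapsed <- system.time(
    saveRDS(part, file = path, compress = compress)
  )[["elapsed"]]
  # How many times larger the full object is than the sample
  ratio <- as.numeric(object.size(x)) / as.numeric(object.size(part))
  c(sample_seconds = elapsed,
    size_ratio = ratio,
    estimated_seconds = elapsed * ratio)
}

# Small demo list standing in for the real 7 GB object
demo <- replicate(
  50,
  data.frame(matrix(sample(0:1, 1e5, replace = TRUE), ncol = 100)),
  simplify = FALSE
)
est <- estimate_save_time(demo, n_sample = 5)
print(est)
```

With the real object you would call something like estimate_save_time(your_list, n_sample = 50). If the list has only a few very large elements, sampling whole elements is too coarse, and timing a slice of one element may give a better guess.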
Upvotes: 1