Reputation: 501
I have a Shiny app that needs to load an .RData file before it can start. The .RData file contains only one big data set, with 28 million rows and 25 variables. Because the file is so large, the Shiny page takes more than 2 minutes to show up. Our acceptable loading time is under 30 seconds. Does anyone have suggestions for speeding up the loading?
I also tried loading the data with the fread function (from the data.table package), but it still takes more than 2 minutes. I guess load() on the .RData file is still faster than fread() on a .csv?
Thank you!
Upvotes: 3
Views: 4318
Reputation: 13932
Don't use compression. If you have a fast disk and the variables are numeric, reading an uncompressed RDS file is much faster than reading a compressed one:
> l = lapply(1:25, function(o) rnorm(28e6))
> names(l) = paste0("V",1:25)
> attr(l,"row.names") = .set_row_names(length(l[[1]]))
> class(l) = "data.frame"
> saveRDS(l, file="data.rds", compress=FALSE)
(new session)
> system.time(d<-readRDS("data.rds"))
   user  system elapsed
  6.474   2.091   8.576
> dim(d)
[1] 28000000 25
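To check how much compression matters on your own disk, the same comparison can be sketched at a smaller scale. This is only an illustration: the 1e6 x 5 size, the temporary file names, and the variable names are assumptions chosen so it runs quickly, and the timings will depend on your hardware.

```r
# Smaller stand-in for the 28e6 x 25 data set so the comparison runs quickly.
n <- 1e6
cols <- setNames(lapply(1:5, function(i) rnorm(n)), paste0("V", 1:5))
d <- as.data.frame(cols)

# Temporary files (hypothetical names, removed when the session ends).
f_gz  <- tempfile(fileext = ".rds")
f_raw <- tempfile(fileext = ".rds")

saveRDS(d, f_gz)                     # default: gzip compression
saveRDS(d, f_raw, compress = FALSE)  # no compression

# Time reading each file back.
t_gz  <- system.time(d_gz  <- readRDS(f_gz))
t_raw <- system.time(d_raw <- readRDS(f_raw))

# Elapsed seconds for each variant, and the file sizes in MB.
print(c(compressed = t_gz[["elapsed"]], uncompressed = t_raw[["elapsed"]]))
print(round(file.size(c(f_gz, f_raw)) / 1e6, 1))

# Both files hold exactly the same data.
stopifnot(identical(d_gz, d_raw))
```

For a fair measurement, read each file in a fresh R session, as in the transcript above, since the operating system may have cached a file that was just written.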
That said, this seems like a good use case for Rserve, where you can pre-load the data so that when a user connects it is already in memory and shared by all sessions (assuming you're not running a Windows server).
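A related idea that stays inside Shiny, and is not the Rserve setup described above, is to read the file outside the server function, so it is loaded once per R process rather than once per session. The sketch below assumes the shiny package is installed; the path, the small stand-in data set, and the names big_data, ui, and server are hypothetical. The first visitor after the process starts may still have to wait for the load.

```r
library(shiny)

# Hypothetical path. In a real app this would be the file written with
# saveRDS(..., compress = FALSE). A small stand-in is created here only
# so that the sketch runs on its own.
data_path <- file.path(tempdir(), "data.rds")
if (!file.exists(data_path)) {
  saveRDS(data.frame(V1 = rnorm(1000), V2 = rnorm(1000)),
          data_path, compress = FALSE)
}

# Read once, when the R process starts, not inside server().
big_data <- readRDS(data_path)

ui <- fluidPage(
  numericInput("n", "Rows to show", value = 5, min = 1, max = 50),
  tableOutput("preview")
)

server <- function(input, output, session) {
  # Every session uses the same in-memory object; nothing is reloaded here.
  output$preview <- renderTable(head(big_data, input$n))
}

# Build the app object without launching it.
# In app.R the last expression would simply be shinyApp(ui, server).
app <- shinyApp(ui, server)
```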
Upvotes: 2