Reputation: 135
I know that the R package bigmemory works great for dealing with large matrices and data frames. However, I was wondering if there is any package, or any way, to efficiently work with a large list.
Specifically, I created a list whose elements are vectors. I have a for loop, and during each iteration multiple values are appended to a selected element of that list (a vector). At first it runs fast, but once the iteration count passes maybe 10,000 it slows down gradually (one iteration takes about a second). I'm going to go through about 70,000 to 80,000 iterations, and the list will be very large by then.
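For context, the slow pattern described above looks roughly like the sketch below (the names `results`, `new_vals`, and `pick` are illustrative, not from the question). Appending with `c()` copies the entire existing vector on every call, so the cost of each iteration grows with the vector's current length; collecting pieces in a list and flattening once at the end avoids the repeated copying:

```r
# Hypothetical sketch of the pattern described above; `results`,
# `new_vals`, and `pick` are made-up names for illustration.
n_iter  <- 1000
results <- vector("list", 10)

for (i in seq_len(n_iter)) {
  pick     <- (i %% 10) + 1   # which list element to grow this round
  new_vals <- rnorm(3)        # values produced this iteration
  # c() reallocates and copies the whole existing vector on every
  # append, so later iterations get progressively slower
  results[[pick]] <- c(results[[pick]], new_vals)
}

# A usually much faster alternative: store each iteration's output in
# its own preallocated list slot, then flatten once with unlist()
chunks <- vector("list", n_iter)
for (i in seq_len(n_iter)) chunks[[i]] <- rnorm(3)
flat <- unlist(chunks)
```

This doesn't remove the memory pressure of a huge list, but it sidesteps the quadratic copying that makes append-in-a-loop degrade over tens of thousands of iterations.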
So I was just wondering if there is something like big.list as the big.matrix in the bigmemory package that could speed up this whole process.
Thanks!
Upvotes: 3
Views: 1299
Reputation: 343
The DSL package might help. The DList object works as a drop-in replacement for R's list. Further, it also provides a distributed-list facility.
Upvotes: 1
Reputation: 42283
I'm not really sure if this is a helpful answer, but you can interactively work with lists on disk using the filehash package.
For example, here's some code that creates a disk database, assigns a preallocated empty list to the database, then runs a loop (recording the current time) that fills the list in the database.
# load the filehash package (install.packages("filehash") if needed)
library(filehash)
# how many items in the list?
n <- 100000
# set up a database on disk
dbCreate("testDB")
db <- dbInit("testDB")
# preallocate an empty list in the database
db$time <- vector("list", length = n)
# fill the list using the disk-backed object
for (i in 1:n) db$time[[i]] <- Sys.time()
There is hardly any use of RAM during this process; however, it is VERY slow (two orders of magnitude slower than doing it in RAM in some of my tests) due to constant disk I/O: each `db$time[[i]] <-` assignment reads the whole list back from disk, modifies it, and writes the whole thing out again. So I'm not sure this method is a good answer to the question of how to speed up working on big objects.
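If the disk backing is still attractive, one way to cut the I/O cost is to batch the writes: build the list in RAM and assign it to the database once, paying the serialization cost a single time instead of once per iteration. This is a sketch reusing the same filehash calls as above; the database name `testDB2` is an assumed name for illustration:

```r
library(filehash)

n <- 1000  # smaller n just for the sketch

# build the list entirely in RAM first...
times <- vector("list", n)
for (i in 1:n) times[[i]] <- Sys.time()

# ...then write it to the disk database in a single assignment
dbCreate("testDB2")   # "testDB2" is an assumed, illustrative name
db <- dbInit("testDB2")
db$time <- times
```

The trade-off is that RAM usage during the loop is back to that of an ordinary in-memory list; this only helps when the goal is persistence rather than avoiding RAM altogether.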
Upvotes: 3