jpd527
jpd527

Reputation: 1551

R list access time

I was under the impression that any call to a list by the index was a constant runtime (i.e. O(1) ), but it seems that this is not actually the case, as shown below. What causes this increase in runtime (looks like O(n) ), and more importantly, is there a way around it? I would like to have a list of data.frame's that has the constant runtime access of a hash set (and there will be no changes to the "key set" so a "perfect" hash is possible).

> l <- list()
> system.time(l[[300000000]] <- "test")
   user  system elapsed 
   1.86    0.36    2.21 
> system.time(l[[299999999]] <- "test")
   user  system elapsed 
   4.42    0.73    5.15 
> system.time(l[[1]] <- "test")
   user  system elapsed 
   4.34    0.44    4.77 
> l <- list()
> system.time(l[[300000]] <- "test")
   user  system elapsed 
      0       0       0 
> l <- list()
> system.time(l[[3000000]] <- "test")
   user  system elapsed 
   0.01    0.00    0.01 
> l <- list()
> system.time(l[[30000000]] <- "test")
   user  system elapsed 
   0.32    0.03    0.36 

Upvotes: 0

Views: 180

Answers (1)

eddi
eddi

Reputation: 49448

The time you measured is spent building the list, not accessing it. See e.g. what the result here is:

l <- list()
l[[3]] <- "test"
l
#[[1]]
#NULL
#
#[[2]]
#NULL
#
#[[3]]
#[1] "test"

So this takes some time:

l = list()
system.time(l[[1e7]] <- "test")
#user  system elapsed 
#0.06    0.00    0.07 

But this doesn't:

l = list()
l[[1e7]] = "boo"    # this will build the list
system.time(l[[1e7]] <- "test")   # now just access
#user  system elapsed 
#   0       0       0 

Perhaps for your purposes you might want to have character keys instead - l[["402023402"]] = "some value".

Upvotes: 1

Related Questions