Reputation: 1551
I was under the impression that any call to a list by the index was a constant runtime (i.e. O(1) ), but it seems that this is not actually the case, as shown below. What causes this increase in runtime (looks like O(n) ), and more importantly, is there a way around it? I would like to have a list of data.frame's that has the constant runtime access of a hash set (and there will be no changes to the "key set" so a "perfect" hash is possible).
> l <- list()
> system.time(l[[300000000]] <- "test")
user system elapsed
1.86 0.36 2.21
> system.time(l[[299999999]] <- "test")
user system elapsed
4.42 0.73 5.15
> system.time(l[[1]] <- "test")
user system elapsed
4.34 0.44 4.77
> l <- list()
> system.time(l[[300000]] <- "test")
user system elapsed
0 0 0
> l <- list()
> system.time(l[[3000000]] <- "test")
user system elapsed
0.01 0.00 0.01
> l <- list()
> system.time(l[[30000000]] <- "test")
user system elapsed
0.32 0.03 0.36
Upvotes: 0
Views: 180
Reputation: 49448
The time you measured is spent building the list, not accessing it. See e.g. what the result here is:
l <- list()
l[[3]] <- "test"
l
#[[1]]
#NULL
#
#[[2]]
#NULL
#
#[[3]]
#[1] "test"
So this takes some time:
l = list()
system.time(l[[1e7]] <- "test")
#user system elapsed
#0.06 0.00 0.07
But this doesn't:
l = list()
l[[1e7]] = "boo" # this will build the list
system.time(l[[1e7]] <- "test") # now just access
#user system elapsed
# 0 0 0
Perhaps for your purposes you might want to have character keys instead - l[["402023402"]] = "some value"
.
Upvotes: 1