Reputation: 614
I am trying to create an ordered "structure" in R so that either the first (or last) element is always the smallest. Currently, I am going about it in the following way:
I have a sorted vector of ascending numerical values and I input new values in it so that the ordering is kept. Then when I want the next element (meaning the smallest one), I just take the first element and delete it from the vector.
# Input the new element
point <- Position(function(v) v < ins, ordVec, right = TRUE)
if (is.na(point)) {point <- 0}
ordVec <- append(ordVec, ins, after = point)
# Get the smallest element
value <- ordVec[1]
ordVec <- ordVec[-1]
where ordVec
is the ordered vector and ins
is the value I want to input.
E.g., I have ordVec <- c(1, 2, 3, 4, 5)
and ins <- 2.5
, and after inputing ins
I get c(1, 2, 2.5, 3, 4, 5)
. Then, the value
is 1
and the resulting vector is c(2, 2.5, 3, 4, 5)
.
I am doing this as a part of a while
loop and when the ordVec
is long, it takes quite a long time.
Does there exist a faster way, be it using something other than the described procedure to get an ordered "structure", or modifying the described procedure and the element retrieval?
Upvotes: 0
Views: 73
Reputation: 46938
I think the problem lies with the Position function. Since your vector is already sorted, you don't need this. If you look at the code, it actually loops through the vector, meaning it will bum along if the vector is long:
Position
function (f, x, right = FALSE, nomatch = NA_integer_)
{
ind <- if (right)
rev(seq_along(x))
else seq_along(x)
for (i in ind) if (f(x[[i]]))
return(i)
nomatch
}
So let's try a few solutions:
f_position = function(ordVec,ins){
point <- Position(function(v) v < ins, ordVec, right = TRUE)
ordVec <- append(ordVec, ins, after = point)
return(c(ordVec[1],ordVec[-1]))
}
f_which = function(ordVec,ins){
point <- max(which(ordVec < ins))
ordVec <- append(ordVec, ins, after = point)
return(c(ordVec[1],ordVec[-1]))
}
f_interval = function(ordVec,ins){
point <- findInterval(ins,ordVec)
ordVec <- append(ordVec, ins, after = point)
return(c(ordVec[1],ordVec[-1]))
}
And see their timing, I use a presorted vector, and 1 insertion value:
set.seed(111)
VEC = sort(runif(10000))
INS = 0.5
microbenchmark(f_position(VEC,INS),times=1000,unit="ms")
Unit: milliseconds
expr min lq mean median uq max
f_position(VEC, INS) 2.525094 2.667763 3.040159 2.769987 2.923979 29.5144
neval
1000
microbenchmark(f_which(VEC,INS),times=1000,unit="ms")
Unit: milliseconds
expr min lq mean median uq max neval
f_which(VEC, INS) 0.148981 0.182694 0.3350992 0.197149 0.396787 28.04375 1000
microbenchmark(f_interval(VEC,INS),times=1000,unit="ms")
Unit: milliseconds
expr min lq mean median uq max
f_interval(VEC, INS) 0.121805 0.15208 0.2851541 0.16236 0.201172 30.68901
neval
1000
I think you are better off using max(which()) or findInterval(), you just need to figure out how to initialize but should not be a problem.
Upvotes: 2