Reputation: 3003
Given two sorted vectors, how can you get, for each value in one, the index of the closest value in the other?
For example, given:
a = 1:20
b = seq(from=1, to=20, by=5)
how can I efficiently get the vector
c = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4)
which, for each value in a, provides the index of the largest value in b that is less than or equal to it. But the solution needs to work for unpredictable (though sorted) contents of a and b, and needs to be fast when a and b are large.
Upvotes: 2
Views: 102
Reputation: 388862
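We can use cut with left-closed intervals (right = FALSE), so that each value of a falls in the interval starting at the largest element of b that is less than or equal to it. This also works for non-integer values, unlike shifting the breaks by 1:
as.integer(cut(a, breaks = unique(c(b, Inf)), right = FALSE, labels = seq_along(b)))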
#[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4
Upvotes: 1
Reputation: 6234
You can use findInterval, which constructs a sequence of intervals given by breakpoints in b and returns the interval indices in which the elements of a are located (see also ?findInterval for additional arguments, such as behavior at interval boundaries).
a = 1:20
b = seq(from = 1, to = 20, by = 5)
findInterval(a, b)
#> [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4
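Since the question asks about unpredictable (though sorted) contents, here is a small sketch on irregular, non-integer input. The vectors a2 and b2 are made up for illustration and are not from the question. It also shows one boundary case worth handling: values smaller than the first element of b get index 0.

```r
# Hypothetical irregular (but sorted) inputs, chosen only for illustration
a2 <- c(0.5, 1, 2.7, 6, 6.1, 15.99, 16, 42)
b2 <- c(1, 6, 11, 16)

idx <- findInterval(a2, b2)
idx
#> [1] 0 1 1 2 2 3 4 4

# 0 means "no element of b2 is <= this value". Indexing with 0 silently
# drops the element, so turn those into NA before looking up values.
idx[idx == 0] <- NA
nearest_below <- b2[idx]
nearest_below
#> [1] NA  1  1  6  6 11 16 16
```

According to ?findInterval, the lookup uses an interval search with O(n * log(N)) complexity, which should hold up when a and b are large.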
Upvotes: 4