Mike T

Reputation: 43642

R: find nearest index

I have two vectors with a few thousand points, but generalized here:

A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

How can I get the indices of A that are nearest to each element of b? The expected outcome would be c(1, 2, 2).

I know that findInterval only returns the index of the element at or below each value, not the nearest one. I'm also aware that which.min(abs(b[2] - A)) is getting warmer, but I can't figure out how to vectorize it to work with long vectors of both A and b.

Upvotes: 7

Views: 5184

Answers (3)

jdobres

Reputation: 11957

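Here's a solution that uses R's often overlooked outer function. Not sure if it'll perform better, but it does avoid sapply. Note that it builds a length(A) by length(b) matrix, so memory use grows with the product of the two lengths.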

A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

dist <- abs(outer(A, b, '-'))
result <- apply(dist, 2, which.min)

# [1] 1 2 2
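
If the goal is also to avoid apply, one possible variation is to pick the minimum of each column with max.col on the negated, transposed distance matrix. This is only a sketch, not part of the original answer; the names d and nearest_idx are made up for illustration, and ties.method = "first" is chosen so that ties resolve to the lower index, as which.min does.

```r
A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

# d[i, j] is the distance between A[i] and b[j]
d <- abs(outer(A, b, "-"))

# After transposing, each row corresponds to one element of b.
# Negating turns "smallest distance" into "largest value" for max.col.
nearest_idx <- max.col(-t(d), ties.method = "first")
nearest_idx
# [1] 1 2 2
```

Like the outer version above, this still allocates the full distance matrix, so it is probably best suited to moderately sized inputs.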

Upvotes: 0

Robert Sugar

Reputation: 151

findInterval gets you very close. You just have to pick between the index it returns and the next one (vec must be sorted in increasing order, as findInterval requires):

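# Returns, for each element of x, the index of the nearest element of vec.
# vec must be sorted in increasing order and have at least two elements.
nearest.vec <- function(x, vec)
{
    # all.inside = TRUE keeps the result in 1..(length(vec) - 1),
    # so smallCandidate + 1 is always a valid index
    smallCandidate <- findInterval(x, vec, all.inside = TRUE)
    largeCandidate <- smallCandidate + 1
    # nudge is TRUE if the large candidate is strictly nearer, FALSE otherwise
    nudge <- 2 * x > vec[smallCandidate] + vec[largeCandidate]
    return(smallCandidate + nudge)
}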

nearest.vec(b, A)

returns (1, 2, 2), and should be comparable to findInterval in performance.
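
The function above relies on vec being sorted. If it is not, one way to adapt the idea is to sort first and map the result back to the original positions with order. The following is a minimal sketch under that assumption; nearest_unsorted and A_unsorted are hypothetical names, not part of the original answer, and vec is assumed to have at least two elements and no NA values.

```r
# Hypothetical helper: same midpoint logic as nearest.vec, but for unsorted vec.
nearest_unsorted <- function(x, vec) {
    ord <- order(vec)          # positions of vec's elements in sorted order
    sorted <- vec[ord]
    lo <- findInterval(x, sorted, all.inside = TRUE)
    hi <- lo + 1
    # TRUE where the upper neighbour is strictly nearer than the lower one
    nudge <- 2 * x > sorted[lo] + sorted[hi]
    # translate indices in the sorted vector back to indices in vec
    ord[lo + nudge]
}

A_unsorted <- c(30, 10, 50, 20, 40)
b <- c(13, 17, 20)

idx <- nearest_unsorted(b, A_unsorted)
idx
# [1] 2 4 4
A_unsorted[idx]
# [1] 10 20 20
```

The extra order call adds a sort, so this is only worthwhile when the input cannot be kept sorted in the first place.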

Upvotes: 11

Sacha Epskamp

Reputation: 47551

You can just wrap your code in sapply. I think this runs at about the same speed as a for loop, so it isn't technically vectorized:

sapply(b, function(x) which.min(abs(x - A)))
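
A closely related option is vapply, which does the same looping but declares the expected result type up front, so the output is always an integer vector. This is a sketch of that variation rather than something from the original answer; it assumes A contains no NA values, since which.min would otherwise be able to return a zero-length result and vapply would stop with an error.

```r
A <- c(10, 20, 30, 40, 50)
b <- c(13, 17, 20)

# integer(1) states that every call must return exactly one integer
idx <- vapply(b, function(x) which.min(abs(x - A)), integer(1))
idx
# [1] 1 2 2
```

On ties, which.min returns the first minimum, so a value exactly halfway between two elements of A maps to the lower index.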

Upvotes: 12
