chyojn

Reputation: 161

How to quickly compute the distance between high-dimensional vectors

Assume there are three groups of high-dimensional vectors:

{a_1, a_2, ..., a_N},

{b_1, b_2, ... , b_N},

{c_1, c_2, ..., c_N}.

Each of my vectors can be represented as x = a_i + b_j + c_k, where 1 <= i, j, k <= N. The vector is therefore encoded as (i, j, k), which can later be decoded back to x = a_i + b_j + c_k.

My question is: given two vectors x = (i_1, j_1, k_1) and y = (i_2, j_2, k_2), is there a method to compute the Euclidean distance between them without decoding x and y?

Upvotes: 1

Views: 1252

Answers (3)

Valentas

Reputation: 2245

If you are happy with an approximate result, you could project your high-dimensional basis vectors into a lower-dimensional space using a random projection. The Johnson-Lindenstrauss lemma says that you can reduce the dimension to O(log N) while preserving pairwise distances approximately, with high probability.
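
A minimal sketch of this idea in NumPy, assuming the basis vectors are stored as rows of three N-by-D arrays A, B and C (these names, the sizes, and the Gaussian projection matrix R are illustrative assumptions, not part of the question). By linearity, projecting the 3N basis vectors once lets you decode any code (i, j, k) directly in the small space:

    import numpy as np

    rng = np.random.default_rng(0)
    N, D = 100, 10_000   # hypothetical sizes
    d = 64               # reduced dimension, on the order of log N

    # Hypothetical basis vectors, one per row.
    A = rng.standard_normal((N, D))
    B = rng.standard_normal((N, D))
    C = rng.standard_normal((N, D))

    # One shared Gaussian random projection into R^d.
    R = rng.standard_normal((D, d)) / np.sqrt(d)
    Ap, Bp, Cp = A @ R, B @ R, C @ R

    def approx_distance(code_x, code_y):
        """Approximate ||x - y|| using only the d-dimensional projections."""
        i1, j1, k1 = code_x
        i2, j2, k2 = code_y
        xp = Ap[i1] + Bp[j1] + Cp[k1]   # projection of x = a_i1 + b_j1 + c_k1
        yp = Ap[i2] + Bp[j2] + Cp[k2]
        return np.linalg.norm(xp - yp)

The projection is computed once; every subsequent distance query works in d dimensions instead of D.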

Upvotes: 0

Olivier Verdier

Reputation: 49146

Let's assume you have only two groups. You are trying to compute the scalar product

(a_i1 + b_j1, a_i2 + b_j2)
  = (a_i1, a_i2) + (b_j1, b_j2) + (a_i1, b_j2) + (a_i2, b_j1)   # <- elementary scalar products

So if you know the necessary elementary scalar products between the basis vectors a_i, b_j, c_k, then you do not need to "decode" x and y and can compute the scalar product directly. The squared Euclidean distance then follows from ||x - y||^2 = (x, x) - 2(x, y) + (y, y).

Note that this is exactly what happens when you compute an ordinary Euclidean distance in a non-orthogonal basis.
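
A minimal sketch of this approach in NumPy for the three-group case, assuming again that the basis vectors are rows of hypothetical N-by-D arrays A, B and C. The six Gram matrices of elementary scalar products are precomputed once; each query then costs a handful of table lookups, independent of D:

    import numpy as np

    rng = np.random.default_rng(0)
    N, D = 100, 10_000   # hypothetical sizes
    A = rng.standard_normal((N, D))
    B = rng.standard_normal((N, D))
    C = rng.standard_normal((N, D))

    # All elementary scalar products, within and between the groups.
    G = {("A", "A"): A @ A.T, ("B", "B"): B @ B.T, ("C", "C"): C @ C.T,
         ("A", "B"): A @ B.T, ("A", "C"): A @ C.T, ("B", "C"): B @ C.T}

    def dot_from_codes(cx, cy):
        """Scalar product (x, y) for x = a_i1 + b_j1 + c_k1, y = a_i2 + b_j2 + c_k2."""
        i1, j1, k1 = cx
        i2, j2, k2 = cy
        return (G["A", "A"][i1, i2] + G["B", "B"][j1, j2] + G["C", "C"][k1, k2]
                + G["A", "B"][i1, j2] + G["A", "B"][i2, j1]    # (a, b) cross terms
                + G["A", "C"][i1, k2] + G["A", "C"][i2, k1]    # (a, c) cross terms
                + G["B", "C"][j1, k2] + G["B", "C"][j2, k1])   # (b, c) cross terms

    def distance_from_codes(cx, cy):
        """Euclidean distance via ||x - y||^2 = (x, x) - 2(x, y) + (y, y)."""
        d2 = dot_from_codes(cx, cx) - 2 * dot_from_codes(cx, cy) + dot_from_codes(cy, cy)
        return np.sqrt(max(d2, 0.0))   # clamp tiny negatives caused by rounding

The precomputation costs O(N^2 D) time and O(N^2) memory, so this pays off when D is large or when many distances are queried.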

Upvotes: 1

duffymo

Reputation: 308763

Square root of the sum of squares of the differences between components. There's no other way to do it.

You should scale the values to guard against overflow/underflow issues. Find the maximum absolute difference between components and divide all the component differences by it before squaring, summing, and taking the square root.
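
A minimal sketch of that scaling trick in plain Python (the function name and the list-based vectors are illustrative):

    import math

    def scaled_distance(x, y):
        """Euclidean distance, scaled by the largest component difference
        so that the squares stay in a safe floating-point range."""
        diffs = [xi - yi for xi, yi in zip(x, y)]
        scale = max(abs(d) for d in diffs)
        if scale == 0.0:
            return 0.0
        return scale * math.sqrt(sum((d / scale) ** 2 for d in diffs))

Python's own math.hypot applies a similar guard internally.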

Upvotes: 3
