machinery

Reputation: 6290

How to calculate normalized euclidean distance on two vectors?

Let's say I have the following two vectors:

x = [(10-1).*rand(7,1) + 1; randi(10,1,1)];
y = [(10-1).*rand(7,1) + 1; randi(10,1,1)];

The first seven elements are continuous values in the range [1,10]. The last element is an integer in the range [1,10].

Now I would like to compute the Euclidean distance between x and y. I think the integer element is a problem: the other elements can get arbitrarily close to each other, but the integer element always differs in steps of one. So the distance is biased towards the integer element.

How can I calculate something like a normalized euclidean distance on it?

Upvotes: 3

Views: 25069

Answers (3)

Cohensius

Reputation: 545

From Euclidean Distance - raw, normalized and double‐scaled coefficients

SYSTAT, Primer 5, and SPSS provide Normalization options for the data so as to permit an investigator to compute a distance coefficient which is essentially “scale free”. Systat 10.2’s normalised Euclidean distance produces its “normalisation” by dividing each squared discrepancy between attributes or persons by the total number of squared discrepancies (or sample size).

normalised Euclidean distance

Frankly, I can see little point in this standardization, as the final coefficient still remains scale‐sensitive. That is, it is impossible to know whether the value indicates high or low dissimilarity from the coefficient value alone.
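To make the quoted description concrete, here is a minimal sketch. It assumes that "dividing each squared discrepancy by the total number of squared discrepancies" means taking the mean of the squared differences and then the square root; the quote does not spell this out, so treat the formula and the example values as assumptions.

```matlab
% Sketch only: assumes the "normalised" distance is
% sqrt(sum((x - y).^2) / n), which the quoted text does not state explicitly.
x = [2; 4; 6; 8];                       % example values, chosen for illustration
y = [1; 4; 8; 8];
n = numel(x);                           % number of squared discrepancies

raw_dist  = sqrt(sum((x - y).^2));      % plain Euclidean distance
norm_dist = sqrt(sum((x - y).^2) / n);  % each squared discrepancy divided by n
```

Under this reading, norm_dist is just raw_dist / sqrt(n), which fits the quote's complaint that the coefficient is still scale-sensitive.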

Upvotes: 1

Chris

Reputation: 470

I would rather normalise x and y before calculating the distance and then vanilla Euclidean would suffice.

In your example

x_norm = (x -1) / 9;          % normalised x
y_norm = (y -1) / 9;          % normalised y
dist = norm(x_norm - y_norm); % Euclidean distance between normalised x, y
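The same min-max scaling can be written with explicit bounds, so that it carries over to vectors whose elements have different ranges. This is only a sketch: lo and hi are illustrative names, and the bounds [1,10] are taken from the question.

```matlab
% Min-max scaling with explicit per-element bounds.
% lo and hi are illustrative names; the range [1,10] comes from the question.
x = [(10-1).*rand(7,1) + 1; randi(10,1,1)];
y = [(10-1).*rand(7,1) + 1; randi(10,1,1)];

lo = ones(8,1);                   % lower bound of each element
hi = 10 * ones(8,1);              % upper bound of each element

x_norm = (x - lo) ./ (hi - lo);   % each element mapped to [0, 1]
y_norm = (y - lo) ./ (hi - lo);
dist = norm(x_norm - y_norm);     % Euclidean distance on the scaled vectors
```

Because every element shares the same range here, dist equals norm(x - y) / 9, so the scaling changes the overall magnitude but not the relative weight of the integer element.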

However, I am not sure whether the integer element really introduces a bias, and this is drifting off-topic for Stack Overflow :)

Upvotes: 3

ibezito

Reputation: 5822

According to Wolfram Alpha, and the following answer from Cross Validated, the normalized Euclidean distance is defined by:

$$\mathrm{NED}(x, y) = \frac{1}{2}\,\frac{\operatorname{Var}(x - y)}{\operatorname{Var}(x) + \operatorname{Var}(y)}$$

You can calculate it with MATLAB by using:

0.5*(std(x-y)^2) / (std(x)^2+std(y)^2)

Alternatively, you can use:

0.5*((norm((x-mean(x))-(y-mean(y)))^2)/(norm(x-mean(x))^2+norm(y-mean(y))^2))
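As a quick sanity check, the two expressions can be evaluated on the vectors from the question and compared. This is a sketch; the variable names xc, yc, d_std and d_norm are introduced here for illustration.

```matlab
% Compare the std-based and norm-based expressions on the question's vectors.
x = [(10-1).*rand(7,1) + 1; randi(10,1,1)];
y = [(10-1).*rand(7,1) + 1; randi(10,1,1)];

xc = x - mean(x);   % centred x
yc = y - mean(y);   % centred y

d_std  = 0.5 * (std(x - y)^2) / (std(x)^2 + std(y)^2);
d_norm = 0.5 * (norm(xc - yc)^2) / (norm(xc)^2 + norm(yc)^2);

% Both forms agree up to floating-point rounding.
assert(abs(d_std - d_norm) < 1e-12);
```

The two forms agree because norm(xc)^2 equals (n-1)*std(x)^2, so the factor n-1 cancels between numerator and denominator. The result lies in [0, 1], and it is undefined (0/0) when both vectors are constant.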

Upvotes: 10
