Reputation: 299
Suppose that I have two vectors.
x1 = c(-1, 2, 3)
x2 = c(4, 0, -3)
To calculate the Euclidean distance between them, I tried three different approaches.
1- The built-in function norm
s = cbind(x1, x2)
norm(s, "2")
#[1] 5.797896
2- Hand calculation
sqrt(sum((x2 - x1) ^ 2))
#[1] 8.062258
3- custom function
lpnorm <- function(x, p) {
  n <- sum(abs(x) ^ p) ^ (1 / p)
  return(n)
}
lpnorm(s, 2)
#[1] 6.244998
Why did I get different results?
If I am doing something wrong, how can I fix it?
Upvotes: 2
Views: 1432
Reputation: 73325
You need s = x2 - x1.
norm(s, "2")
#[1] 8.062258
sqrt(sum(s ^ 2)) ## or: sqrt(c(crossprod(s)))
#[1] 8.062258
lpnorm(s, 2)
#[1] 8.062258
If you define s = cbind(x1, x2), none of the options you listed will compute the Euclidean distance between x1 and x2, but we can still make them output the same value. In that case they compute the L2 norm of the vector c(x1, x2).
norm(s, "F")
#[1] 6.244998
sqrt(sum(s ^ 2))
#[1] 6.244998
lpnorm(s, 2)
#[1] 6.244998
Finally, norm is not a common way of computing a distance. It is really meant for matrix norms. When you call norm(cbind(x1, x2), "2"), it computes the L2 matrix norm, which is the largest singular value of the matrix cbind(x1, x2).
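To see where the 5.797896 from the question comes from, the matrix 2-norm can be compared with the singular values directly. This is only a small sketch reusing x1 and x2 from the question; the variable names A, d, spectral and frobenius are chosen here for illustration.

```r
x1 <- c(-1, 2, 3)
x2 <- c(4, 0, -3)
A <- cbind(x1, x2)

# Singular values of the 3 x 2 matrix, in decreasing order
d <- svd(A)$d

# The matrix "2" norm (spectral norm) is the largest singular value
spectral <- norm(A, "2")
all.equal(spectral, d[1])

# The Frobenius norm uses all singular values, so it is never smaller
frobenius <- norm(A, "F")
all.equal(frobenius, sqrt(sum(d ^ 2)))
```

The Frobenius norm is the one that agrees with sqrt(sum(s ^ 2)), which is why "F" rather than "2" reproduces the lpnorm result for a matrix.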
So my problem is with defining s. OK, what if I have more than three vectors?
In that case you want a pairwise Euclidean distance matrix. See ?dist.
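As a rough illustration of dist, the vectors can be stacked as rows of a matrix. Here x3 is a made-up third vector that is not part of the question, and the names X, d_mat and d_l3 are only for this sketch.

```r
x1 <- c(-1, 2, 3)
x2 <- c(4, 0, -3)
x3 <- c(0, 1, 1)   # hypothetical extra vector, for illustration only

# dist() treats each ROW as one observation, so bind the vectors by row
X <- rbind(x1, x2, x3)

d_euclid <- dist(X)             # method = "euclidean" is the default
d_mat <- as.matrix(d_euclid)    # full symmetric matrix, zeros on the diagonal
d_mat["x1", "x2"]               # same value as sqrt(sum((x2 - x1) ^ 2))

# Other Lp distances are available through method = "minkowski"
d_l3 <- dist(X, method = "minkowski", p = 3)
```

Note that dist works on rows, whereas cbind(x1, x2) in the question puts the vectors in columns.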
I have a training set (three or more rows) and one test set (a single row). I would like to calculate the Euclidean distance, or maybe other distances, between them. This is why I want to be sure about the distance calculation.
You want the distance between one vector and each of many others, and the result is a vector?
set.seed(0)
X_train <- matrix(runif(10), 5, 2)
x_test <- runif(2)
S <- t(X_train) - x_test
apply(S, 2, norm, "2") ## don't try other types than "2"
#[1] 0.8349220 0.7217628 0.8012416 0.6841445 0.9462961
apply(S, 2, lpnorm, 2)
#[1] 0.8349220 0.7217628 0.8012416 0.6841445 0.9462961
sqrt(colSums(S ^ 2)) ## only for L2-norm
#[1] 0.8349220 0.7217628 0.8012416 0.6841445 0.9462961
I would stress again that norm fails on a vector unless type = "2". ?norm clearly says that this function is intended for matrices. What norm does is very different from your self-defined lpnorm function: lpnorm is a vector norm, norm is a matrix norm. Even "L2" means something different for a matrix and for a vector.
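For distances other than Euclidean, one option is to stay with the vector norm and wrap it. The following is a minimal sketch, not a definitive implementation: lp_dist_to_rows is a hypothetical helper name introduced here, while lpnorm, X_train and x_test follow the definitions used earlier in the thread.

```r
# Vector Lp norm, as defined in the question
lpnorm <- function(x, p) sum(abs(x) ^ p) ^ (1 / p)

# Hypothetical helper: Lp distance from one test vector to every row of X
lp_dist_to_rows <- function(X, x, p = 2) {
  stopifnot(ncol(X) == length(x))
  diffs <- sweep(X, 2, x)          # subtract x from each row of X
  apply(diffs, 1, lpnorm, p = p)   # one vector norm per row
}

set.seed(0)
X_train <- matrix(runif(10), 5, 2)
x_test <- runif(2)

d2 <- lp_dist_to_rows(X_train, x_test, p = 2)   # Euclidean distance
d1 <- lp_dist_to_rows(X_train, x_test, p = 1)   # Manhattan distance
```

With p = 2 this should agree with the sqrt(colSums(S ^ 2)) line above, and it avoids passing vectors to norm at all.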
Upvotes: 3