Reputation: 11
I have a dataframe not dissimilar to this:
X Y Z Point
1 1 1 1
2 1 1 2
1 2 1 3
1 1 10 4
I am trying to determine the distances between these points and have calculated the Euclidean distance between each pair using the stats::dist() function. The calculated distances are as follows:
#Calculating distances
df <- dist(data, method = "euclidean")
#Output
df
         1        2        3
2 1.414214
3 2.236068 1.732051
4 9.486833 9.273618 9.110434
However, the distance from point 1 to either point 2 or point 3 should be 1, and the distance from point 1 to point 4 should be 9. I am unsure whether some form of normalisation is being applied or whether one axis is weighted more heavily than another. I would appreciate help in finding the correct distances between these coordinates so that I can scale this up to a much larger dataset!
Upvotes: 1
Views: 553
Reputation: 39707
You are also including the column Point when calculating the distance. Subset data to the coordinate columns, e.g. with data[,1:3] or data[,-4].
dist(data[,1:3], method = "euclidean")
#         1        2        3
#2 1.000000
#3 1.000000 1.414214
#4 9.000000 9.055385 9.055385
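For a larger dataset it may be safer to pick the coordinate columns by name rather than by position. The sketch below rebuilds the example data from the question (assuming the columns are named exactly X, Y, Z and Point), shows where the extra distance comes from, and uses Point as row labels so the output can still be indexed by point ID. The names with_id, coords and d are only illustrative.

```r
# Rebuild the example data from the question (column names as posted)
data <- data.frame(
  X     = c(1, 2, 1, 1),
  Y     = c(1, 1, 2, 1),
  Z     = c(1, 1, 1, 10),
  Point = 1:4
)

# With Point included, the ID contributes (1 - 2)^2 to the squared distance,
# so points 1 and 2 come out as sqrt(1 + 1) instead of sqrt(1)
with_id <- as.matrix(dist(data, method = "euclidean"))

# Select only the coordinate columns by name; keep Point as row labels
coords <- as.matrix(data[, c("X", "Y", "Z")])
rownames(coords) <- data$Point
d <- as.matrix(dist(coords, method = "euclidean"))

with_id["1", "2"]  # sqrt(2), about 1.414214
d["1", "2"]        # 1
d["1", "4"]        # 9
```

Converting the result with as.matrix() gives a full symmetric matrix, which is convenient for lookups but uses roughly twice the memory of the dist object, so it may be worth keeping the dist object itself for very large inputs.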
Upvotes: 2