Reputation: 1235
I am trying to calculate Euclidean distances for the Iris dataset. Basically, I want to calculate the distance between each pair of observations. I have code that works as follows:
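iris_row <- nrow(iris)                # number of observations (150)
m <- matrix(0, iris_row, iris_row)    # preallocate the distance matrix
for (i in 1:iris_row) {               # i and j both index rows (observations)
  for (j in 1:iris_row) {
    m[i, j] <- sqrt((iris[i, 1] - iris[j, 1])^2 +
                    (iris[i, 2] - iris[j, 2])^2 +
                    (iris[i, 3] - iris[j, 3])^2 +
                    (iris[i, 4] - iris[j, 4])^2)
  }
}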
Although this works, I don't think this is a good way to write R-style code. I know that R has a built-in function to calculate Euclidean distance. Without using the built-in function, I want to know of better code (faster and with fewer lines) that does the same as mine.
Upvotes: 1
Views: 3036
Reputation: 314
Or stay with the standard package stats:
m <- dist(iris[,1:4])
This gives you an object of the class dist, which stores the lower triangle (all you need) compactly. You can convert it to an ordinary full symmetric matrix if, e.g., you want to look at some elements:
> as.matrix(m)[1:5,1:5]
          1         2        3         4         5
1 0.0000000 0.5385165 0.509902 0.6480741 0.1414214
2 0.5385165 0.0000000 0.300000 0.3316625 0.6082763
3 0.5099020 0.3000000 0.000000 0.2449490 0.5099020
4 0.6480741 0.3316625 0.244949 0.0000000 0.6480741
5 0.1414214 0.6082763 0.509902 0.6480741 0.0000000
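As a quick sanity check, the sketch below compares one entry of the dist result with the explicit formula from the question. The names x, full, and manual_12 are introduced here only for illustration, and the data are first converted to a numeric matrix, which is an assumption of this sketch rather than something dist requires.

```r
# Sketch: check dist() against the explicit formula for one pair of rows.
x <- as.matrix(iris[, 1:4])        # numeric columns only; Species is dropped
full <- as.matrix(dist(x))         # dist() is Euclidean by default; expand to 150 x 150

# Distance between observations 1 and 2, computed by hand
manual_12 <- sqrt(sum((x[1, ] - x[2, ])^2))

print(all.equal(full[1, 2], manual_12))  # the two values should agree
print(isSymmetric(full))                 # the full matrix is symmetric
```

Because the dist object keeps only the lower triangle, calling as.matrix is only worthwhile when you need to index arbitrary pairs.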
Upvotes: 0
Reputation: 545578
The part inside the loop can be written as
# use columns 1:4 only; column 5 (Species) is a factor and cannot be subtracted
m[i, j] = sqrt(sum((iris[i, 1:4] - iris[j, 1:4]) ^ 2))
I’d keep the nested loop; there is nothing wrong with it here.
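For completeness, here is a minimal sketch of the whole loop with that one-line body, restricted to the four numeric columns. Two things in it are my own assumptions rather than part of the question: it converts the data to a numeric matrix first (row indexing on a matrix is usually much cheaper than on a data frame), and it fills only the lower triangle and mirrors each value, since the distance is symmetric. The names x, n, and d are illustrative.

```r
# Sketch: nested loop over pairs of observations, without dist().
x <- as.matrix(iris[, 1:4])   # numeric matrix of the four measurements
n <- nrow(x)                  # number of observations
m <- matrix(0, n, n)          # preallocated result; diagonal stays 0

for (i in seq_len(n)) {
  for (j in seq_len(i)) {     # lower triangle, including the diagonal
    d <- sqrt(sum((x[i, ] - x[j, ])^2))
    m[i, j] <- d
    m[j, i] <- d              # mirror into the upper triangle
  }
}
```

This halves the number of distance computations compared with looping over all pairs, while still using nothing beyond base arithmetic.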
Upvotes: 3