X <- matrix(c(10, 9,
              1, 4,
              9, 1,
              5, 10,
              3, 10,
              3, 6), ncol = 2, byrow = TRUE)
y <- c("a", "b", "a", "c", "c", "b")
X_new <- matrix(c(6, 4,
                  9, 2,
                  7, 2), ncol = 2, byrow = TRUE)
knn <- function(train_x, train_y, test_x) {
  train_x <- as.matrix(train_x)
  test_x <- as.matrix(test_x)
  d1 <- dim(train_x)
  n1 <- d1[1]
  p1 <- d1[2]
  d2 <- dim(test_x)
  n2 <- d2[1]
  p2 <- d2[2]
  pred_y <- character(n2)
  for (i in seq_len(n2)) {
    # Subtract the i-th test point from every training row
    X_temp <- train_x - matrix(test_x[i, ], nrow = n1, ncol = p1, byrow = TRUE)
    euc_dist <- sqrt(rowSums(X_temp^2))
    print(euc_dist)
    print(order(euc_dist))
    # Label of the single nearest training point
    pred_y[i] <- train_y[which.min(euc_dist)]
  }
  return(pred_y)
}
knn(X,y,X_new)
This prints the following:
[1] 6.403124 5.000000 4.242641 6.082763 6.708204 3.605551
[1] 6 3 2 4 1 5
[1] 7.071068 8.246211 1.000000 8.944272 10.000000 7.211103
[1] 3 1 6 2 4 5
[1] 7.615773 6.324555 2.236068 8.246211 8.944272 5.656854
[1] 3 6 2 1 4 5
[1] "b" "a" "a"
I think the first order() call should print "5 3 2 4 6 1".
"6 3 2 4 1 5" isn't what I expected. Is there something I'm missing?
What you are seeing is the permutation of indices that sorts the vector: order(x)[1] is the position of the smallest value, order(x)[2] is the position of the second smallest, and so on. This is more easily shown than explained:
out <- c(6.403124, 5.000000, 4.242641, 6.082763, 6.708204, 3.605551)
order(out)
# [1] 6 3 2 4 1 5
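Here the leading 6 means that the sixth element (3.605551) is the smallest. So, as you say, it looks odd, because you expected c(5, 3, 2, 4, 6, 1), the rank of each value in ascending order (or c(2, 4, 5, 3, 1, 6) in descending order). But one common way of using order is x[order(x)] to get the sorted vector, and if you do that, you get: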
out[order(out)]
# [1] 3.605551 4.242641 5.000000 6.082763 6.403124 6.708204
or
out[order(out, decreasing = TRUE)]
# [1] 6.708204 6.403124 6.082763 5.000000 4.242641 3.605551
which is pretty useful.
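To make the difference concrete, here is a small sketch of how order and rank relate. It reuses the out vector from above and assumes there are no ties; with ties, rank returns averaged ranks by default and the identity below no longer holds exactly.

```r
out <- c(6.403124, 5.000000, 4.242641, 6.082763, 6.708204, 3.605551)

o <- order(out)  # positions of the values, smallest first
r <- rank(out)   # rank of each value, in original position

# The first entry of order() points at the minimum
is_min <- out[o[1]] == min(out)

# With no ties, order() applied twice yields the ranks
same <- all(order(o) == r)

print(is_min)
print(same)
```

In other words, order answers "which element comes first?", while rank answers "what place does this element take?".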
If you want to know which place each value takes in the sorted sequence, you could go with rank:
rank(out)
# [1] 5 3 2 4 6 1
or
rank(-out)  # base R; rank(desc(out)) also works if dplyr is loaded
# [1] 2 4 5 3 1 6
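For the k-nearest-neighbours function in the question, order is probably the output you want, since it gives the row indices of the closest training points directly. The sketch below is only an illustration: k = 3 and the majority vote via table are assumptions, not part of the original code, which uses only the single nearest neighbour.

```r
# Distances from the first test point to the six training points
euc_dist <- c(6.403124, 5.000000, 4.242641, 6.082763, 6.708204, 3.605551)
y <- c("a", "b", "a", "c", "c", "b")

k <- 3  # hypothetical choice of neighbourhood size

# Row indices of the k closest training points
nearest <- order(euc_dist)[seq_len(k)]

# Majority vote among their labels (ties go to the first label alphabetically)
votes <- table(y[nearest])
pred <- names(votes)[which.max(votes)]

print(nearest)
print(pred)
```

With k = 1 this reduces to train_y[which.min(euc_dist)], which is what the question's function already does.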