Reputation: 8291
I am trying to implement the k-nearest neighbors algorithm in Python and ended up with the following code. However, I am struggling to find the indices of the items that are the nearest neighbors. The function below returns an array of the distances to the nearest neighbors, but I need the indices of those neighbors in features_train (the input matrix to the algorithm).
def find_kNN(k, feature_matrix, query_house):
    alldistances = np.sort(compute_distances(feature_matrix, query_house))
    dist2kNN = alldistances[0:k+1]
    for i in range(k, len(feature_matrix)):
        dist = alldistances[i]
        j = 0
        # if there is a closer neighbor
        if dist < dist2kNN[k]:
            # insert this new neighbor
            for d in range(0, k):
                if dist > dist2kNN[d]:
                    j = d + 1
            dist2kNN = np.insert(dist2kNN, j, dist)
            dist2kNN = dist2kNN[0: len(dist2kNN) - 1]
    return dist2kNN
print(find_kNN(4, features_train, features_test[2]))
Output is:
[ 0.0028605 0.00322584 0.00350216 0.00359315 0.00391858]
Can someone help me identify these nearest items in features_train?
Upvotes: 0
Views: 1717
Reputation: 5929
I suggest using the Python library sklearn, which has a KNeighborsClassifier from which, once fitted, you can retrieve the indices of the nearest neighbors you are looking for.
Try this out:
# Import
from sklearn.neighbors import KNeighborsClassifier
# Instantiate your classifier
neigh = KNeighborsClassifier(n_neighbors=4)  # k=4 or whatever you want
# Fit your classifier
neigh.fit(X, y)  # X is your training set and y is the training output
# Get the indices (row positions in X) of the nearest neighbors.
# X_test is the sample or array of samples whose k nearest neighbors you want;
# it must be 2-D, with one row per sample.
neigh.kneighbors(X_test, return_distance=False)
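If you would rather stay with plain NumPy, np.argsort returns the positions that would sort an array, so it gives you the neighbor indices directly. The following is only a minimal sketch: find_knn_indices and the toy data are hypothetical names of mine, and I am assuming Euclidean distance because the question does not show compute_distances.

```python
import numpy as np

def find_knn_indices(k, feature_matrix, query):
    """Return (indices, distances) of the k rows of feature_matrix closest to query."""
    # Assumption: Euclidean distance. Replace this line with your own
    # compute_distances(feature_matrix, query) if it uses a different metric.
    distances = np.sqrt(((feature_matrix - query) ** 2).sum(axis=1))
    # argsort gives the row indices that would sort distances in ascending order;
    # the first k of them are the k nearest neighbors.
    nearest = np.argsort(distances)[:k]
    return nearest, distances[nearest]

# Made-up toy data, for illustration only
toy_train = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [5.0, 5.0]])
toy_query = np.array([0.0, 0.1])

indices, dists = find_knn_indices(2, toy_train, toy_query)
print(indices)  # row positions in toy_train
print(dists)    # matching distances, nearest first
```

The returned indices are row positions in the training matrix, so toy_train[indices] (or features_train[indices] in your case) gives you the neighbor rows themselves.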
Upvotes: 1