Jhon

Reputation: 37

How does the KNN algorithm work?

When I first looked at this algorithm, I didn't quite understand how it is supposed to work. I have a dataset, and I have a new data point that needs to be classified. The dataset (columns: a, b, class):

2   2   a
3   5   a
1   8   b
3   16  b
4   12  a
5   20  a

And the new data point:

1   2

Now I need to classify the new point as "a" or "b". I can calculate the distance from the new point to each row of the dataset as sqrt((aNew-Ai)^2+(bNew-Bi)^2). With the distances, I have this data:

    a   b   dist    class
new 1   2   ?       ?
old 2   2   1.0     a
old 3   5   3.6     a
old 1   8   6.0     b
old 3   16  14.1    b
old 4   12  10.4    a
old 5   20  18.4    a
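
For reference, the distance column can be checked by evaluating the formula directly. This is only a minimal sketch: the names `dataset`, `new_point` and `euclidean` are illustrative and not part of any library.

```python
import math

# Rows from the question: (a, b, class label).
dataset = [
    (2, 2, "a"),
    (3, 5, "a"),
    (1, 8, "b"),
    (3, 16, "b"),
    (4, 12, "a"),
    (5, 20, "a"),
]
new_point = (1, 2)


def euclidean(p, q):
    """Euclidean distance between two 2-D points."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


# Distance from the new point to every row, rounded to one decimal place.
distances = [round(euclidean(new_point, (a, b)), 1) for a, b, _ in dataset]
print(distances)
```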

Say, for example, K equals 6. How should I classify my new data?

Upvotes: 2

Views: 708

Answers (2)

user9477964

You need to find the distance between the new data point and every point in your dataset. Then:

  1. Sort these distances in ascending order.
  2. Take the first K distances from the sorted list.
  3. Look up the class of each of those K points.
  4. Count the votes: the class that occurs most often among them is the prediction.
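
The steps above could be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name `knn_classify` and the `(features, label)` layout of `dataset` are assumptions made for the example.

```python
import math
from collections import Counter


def knn_classify(dataset, new_point, k):
    """Return the majority class among the k rows nearest to new_point.

    dataset is a list of (features, label) pairs; this layout is an
    assumption made for the example.
    """
    # Euclidean distance from the new point to every row, kept with its label.
    scored = []
    for features, label in dataset:
        dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(features, new_point)))
        scored.append((dist, label))
    # Step 1: sort by distance, ascending.
    scored.sort(key=lambda pair: pair[0])
    # Steps 2 and 3: labels of the first k rows.
    nearest_labels = [label for _, label in scored[:k]]
    # Step 4: count the votes and return the label with the most.
    votes = Counter(nearest_labels)
    return votes.most_common(1)[0][0]


# Data from the question.
dataset = [
    ((2, 2), "a"), ((3, 5), "a"), ((1, 8), "b"),
    ((3, 16), "b"), ((4, 12), "a"), ((5, 20), "a"),
]
print(knn_classify(dataset, (1, 2), k=3))
```

On a tie, `Counter.most_common` returns the label that was counted first, which in this sketch is the label of the nearer neighbour. That is a side effect of the ordering, not a deliberate tie-breaking rule.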

For an implementation in Java, refer here

Upvotes: 1

Markus

Reputation: 3148

In your example it's "a", because "a" is the most common class among the K (6) nearest neighbours: 4 of the 6 are "a".
But with two classes, K should be an odd number to prevent an ambiguous (tied) classification.
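
To illustrate why an even K can be ambiguous, here is a small sketch that counts the votes for a query point lying between the two classes. The query `(2, 12)` is hypothetical and was picked only to produce a tie; `vote_counts` is an illustrative helper, not a library function.

```python
import math
from collections import Counter

# Rows from the question: (a, b, class label).
dataset = [
    (2, 2, "a"), (3, 5, "a"), (1, 8, "b"),
    (3, 16, "b"), (4, 12, "a"), (5, 20, "a"),
]


def vote_counts(query, k):
    """Count the class labels among the k rows closest to query."""
    ranked = sorted(
        dataset,
        key=lambda row: math.hypot(row[0] - query[0], row[1] - query[1]),
    )
    return Counter(label for _, _, label in ranked[:k])


# Hypothetical query point, not taken from the question.
query = (2, 12)
print(vote_counts(query, 4))  # even K: "a" and "b" get 2 votes each
print(vote_counts(query, 3))  # odd K: "b" has a strict majority
```

With only two classes, an odd K always yields a strict majority; with three or more classes ties are still possible, so some tie-breaking rule is needed in that case.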

Upvotes: 1
