Reputation: 37
When I started studying the k-nearest neighbours (KNN) algorithm, I didn't quite understand how it is supposed to work. I have a dataset, and I have new data that needs to be classified. Dataset:
2 2 a
3 5 a
1 8 b
3 16 b
4 12 a
5 20 a
And a new data point:
1 2
Now I need to classify the new point as "a" or "b".
I can calculate the distance from the new point to each point in the dataset:
sqrt((aNew - Ai)^2 + (bNew - Bi)^2)
With the distances, I have this data:
     a   b   dist   class
new  1   2   ?      ?
old  2   2   1.0    a
old  3   5   3.6    a
old  1   8   6.0    b
old  3   16  14.1   b
old  4   12  10.4   a
old  5   20  18.4   a
Say, for example, K is 6. How should I classify my new data point?
Upvotes: 2
Views: 708
Reputation:
You need to find the distance between the new data point and all the points in your dataset.
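As a rough illustration of that step, here is a minimal Python sketch using the dataset from the question. The names `dataset`, `new_point`, and `euclidean` are made up for this example and are not from any particular library.

```python
import math

# Training data from the question: (a, b, class)
dataset = [
    (2, 2, "a"),
    (3, 5, "a"),
    (1, 8, "b"),
    (3, 16, "b"),
    (4, 12, "a"),
    (5, 20, "a"),
]
new_point = (1, 2)


def euclidean(p, q):
    """Euclidean distance between two 2-D points."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


# Distance from the new point to every labelled point, nearest first
distances = sorted(
    (euclidean(new_point, (a, b)), label) for a, b, label in dataset
)

for dist, label in distances:
    print(f"{dist:.1f} {label}")
```

Sorting by distance puts the nearest neighbours first, so the K nearest are simply the first K entries of `distances`.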
For implementation in Java refer here
Upvotes: 1
Reputation: 3148
In your example it's "a", because that is the most common class among the k (6) nearest neighbours: four of them are "a" and two are "b".
But with two classes, K should be an odd number, so that the vote can never end in a tie.
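The majority vote can be sketched in a few lines of Python. This is only an illustration under simple assumptions (Euclidean distance, unweighted votes); `knn_classify` is a hypothetical helper name, and `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter

# Training data from the question: (a, b, class)
dataset = [
    (2, 2, "a"),
    (3, 5, "a"),
    (1, 8, "b"),
    (3, 16, "b"),
    (4, 12, "a"),
    (5, 20, "a"),
]


def knn_classify(points, query, k):
    """Return (majority class, vote counts) among the k points nearest to query."""
    # Sort labelled points by Euclidean distance to the query, keep the first k
    nearest = sorted(points, key=lambda p: math.dist(query, p[:2]))[:k]
    # Count one vote per neighbour
    votes = Counter(label for _, _, label in nearest)
    return votes.most_common(1)[0][0], dict(votes)


print(knn_classify(dataset, (1, 2), 6))  # all six points vote
print(knn_classify(dataset, (1, 2), 3))  # only the three nearest vote
```

With K = 6 every point in this dataset votes, so the result is just the overall majority class, 4 votes for "a" against 2 for "b". A smaller odd K such as 3 lets only the closest points decide. If a tie did occur, `Counter.most_common` would return whichever tied class was counted first, which is arbitrary, and that is the ambiguity an odd K avoids for two classes.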
Upvotes: 1