Reputation: 445
I have a group of coordinates, plotted below. I would like to cluster the overlapping points (the ones circled in red) together, but I want all the other, non-overlapping points (the ones not circled in red) to be ignored. I cannot use k-means clustering, since that would assign every point to a cluster, including the ones I want ignored. How might I go about this? Thanks.
Desired Output:
Input:
Upvotes: 0
Views: 326
Reputation: 77454
There is more to clustering than k-means. You are missing 50 years of research if k-means is all you consider.
For example, DBSCAN has the concept of noise points that don't belong to any cluster.
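To illustrate the noise concept, here is a minimal sketch using scikit-learn's `DBSCAN`. The sample coordinates and the `eps` value are made up for illustration and would need tuning to the real data. With `min_samples=2`, any point that has at least one other point within `eps` joins a cluster, and every isolated point gets the noise label `-1`.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical sample data: two overlapping pairs and two isolated points.
points = np.array([
    [0.0, 0.0], [0.05, 0.0],    # overlapping pair
    [5.0, 5.0], [5.02, 5.01],   # overlapping pair
    [10.0, 0.0],                # isolated
    [0.0, 10.0],                # isolated
])

# eps is an assumed "overlap" radius; min_samples counts the point itself.
labels = DBSCAN(eps=0.1, min_samples=2).fit_predict(points)

# Points labelled -1 are noise and can simply be dropped.
clustered = points[labels != -1]
print(labels)
```

Note that DBSCAN chains neighbours together, so a long string of closely spaced points would end up in a single cluster even if its two ends are far apart.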
In your case, however, you aren't actually looking for clustering.
Instead, you want to perform a similarity self-join because, as far as I can tell, you want to match pairs of points. It is a special kind of join. There is no standard syntax for this, but think of it as SELECT a.p, b.p FROM data AS a JOIN data AS b ON distance(a.p, b.p) < threshold.
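As a rough sketch of that join in Python, the brute-force version below compares every pair of points once. The function name `similarity_self_join`, the sample coordinates, and the threshold are all made up for illustration. Unlike the SQL above, it reports each pair only once and never matches a point with itself.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

def similarity_self_join(points, threshold):
    """Return index pairs (i, j) with i < j whose distance is below threshold."""
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(points), 2)
        if dist(p, q) < threshold
    ]

# Hypothetical sample data: two overlapping pairs and one isolated point.
points = [(0.0, 0.0), (0.05, 0.0), (5.0, 5.0), (5.02, 5.01), (10.0, 0.0)]

pairs = similarity_self_join(points, threshold=0.1)

# Indices of points that overlap with at least one other point.
matched = sorted({i for pair in pairs for i in pair})
print(pairs, matched)
```

This is quadratic in the number of points, which is fine for a few thousand. For larger inputs a spatial index is the usual choice, for example `scipy.spatial.cKDTree.query_pairs`.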
Upvotes: 1