Pavol

Reputation: 67

ELKI KNNDistancesSampler

Does anybody know what the KNNDistancesSampler in ELKI calculates? I can see the Java code for the class: https://github.com/elki-project/elki/blob/master/elki/src/main/java/de/lmu/ifi/dbs/elki/algorithm/KNNDistancesSampler.java, but I am really bad at Java. I can see that it gets the distances to a point's neighbors via getKNNDistance()... Does it return the average distance (Euclidean by default) to the k nearest neighbors of each point? I know it is meant for estimating epsilon for DBSCAN and similar algorithms, but I'd also like to know what it actually does. Thank you.

Upvotes: 0

Views: 71

Answers (1)

Erich Schubert

Reputation: 8715

References for this are given in the class documentation:

Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu
A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise
Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining (KDD '96)

Erich Schubert, Jörg Sander, Martin Ester, Hans-Peter Kriegel, Xiaowei Xu
DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN
ACM Trans. Database Systems (TODS)

The class returns a sample of the kNN distances, not just their average, to help choose the epsilon parameter by applying the "elbow" method to the resulting plot. It does not automate this choice; it only produces the plot.
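To make the idea concrete, here is a minimal plain-Java sketch of what such a sampled k-distance plot contains. This is not ELKI's code and does not use the ELKI API: the class name `KDistanceSketch`, the method `sampledKDistances`, and its parameters are made up for illustration. It assumes that "kNN distance" means the distance to the k-th nearest neighbor (as in the sorted k-dist graph of the DBSCAN paper), that the query point itself is not counted as a neighbor, and that the distance is Euclidean. It uses a brute-force scan rather than an index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of ELKI.
class KDistanceSketch {
    // Plain Euclidean distance between two points of equal dimension.
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // For a random sample of the points, compute the distance to the k-th
    // nearest neighbor (assumption: the point itself is excluded), and
    // return these distances sorted in descending order.
    static double[] sampledKDistances(double[][] data, int k, double sampleRate, long seed) {
        int n = data.length;
        if (k < 1 || k >= n) {
            throw new IllegalArgumentException("k must satisfy 1 <= k < number of points");
        }
        // Shuffle the indices and keep the first m as the sample.
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        int m = Math.min(n, Math.max(1, (int) Math.round(sampleRate * n)));

        double[] kdist = new double[m];
        for (int s = 0; s < m; s++) {
            int q = indices.get(s);
            // Brute force: distances from the sampled point to all others.
            double[] dists = new double[n - 1];
            int pos = 0;
            for (int j = 0; j < n; j++) {
                if (j != q) {
                    dists[pos++] = euclidean(data[q], data[j]);
                }
            }
            Arrays.sort(dists);
            kdist[s] = dists[k - 1]; // distance to the k-th nearest neighbor
        }

        // Sort ascending, then reverse in place to get descending order.
        Arrays.sort(kdist);
        for (int i = 0, j = m - 1; i < j; i++, j--) {
            double tmp = kdist[i];
            kdist[i] = kdist[j];
            kdist[j] = tmp;
        }
        return kdist;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {1, 0}, {0, 1}, {10, 10}};
        // Each printed value is one y-value of the k-distance plot.
        for (double d : sampledKDistances(data, 1, 1.0, 42L)) {
            System.out.println(d);
        }
    }
}
```

Plotting the returned values against their rank gives the curve on which you look for the "elbow"; the distance at the elbow is a candidate for epsilon. Picking that point is still up to you.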

Upvotes: 1
