Reputation: 1
I was facing a clustering problem where I did not know, and could not determine, the number of clusters beforehand. So my solution was basically:
for i in range(2, limit):
    do clustering with i clusters
    compute the Calinski-Harabasz index
Take the configuration with the largest index. So the natural question that follows is: given a set of points to be clustered and a quality index (e.g. Calinski-Harabasz, Dunn, Davies-Bouldin), does the index admit a unique extremum as a function of the number of clusters, or can there be several?
Upvotes: 0
Views: 38
Reputation: 1116
First, it depends on how closely you look. Most likely there will be a unique global extremum, but other local extrema may exist with values close to it. Second, any criterion that you choose will favour a certain granularity range. It is easy to construct data sets that admit several nested clusterings, say one that is fine-grained and another that is coarse-grained (and optionally more in between). Any particular criterion will favour its own sweet spot in this landscape.
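To make the nested case concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the toy 1-D data (three coarse groups, each made of two tight sub-groups), the helper names, and the choice to find the best partition for each k by exhaustive search over contiguous splits instead of running k-means. The exhaustive search is exact here only because, in one dimension, a partition that minimises the within-cluster sum of squares consists of contiguous intervals. The Calinski-Harabasz index is computed as [B/(k-1)] / [W/(n-k)], with B and W the between- and within-cluster sums of squares.

```python
from itertools import combinations


def sum_sq(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_within_ss(points, k):
    """Smallest within-cluster sum of squares over all splits of the
    sorted 1-D points into k contiguous groups (exhaustive search)."""
    n = len(points)
    best = float("inf")
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        w = sum(sum_sq(points[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, w)
    return best


def calinski_harabasz(points, k):
    """CH index of the best k-cluster partition: [B/(k-1)] / [W/(n-k)]."""
    n = len(points)
    within = best_within_ss(points, k)
    between = sum_sq(points) - within  # total SS = between SS + within SS
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical nested data: 3 coarse groups, each made of 2 tight sub-groups.
points = sorted(base + sub + jitter
                for base in (0.0, 10.0, 20.0)
                for sub in (0.0, 1.0)
                for jitter in (0.0, 0.1, 0.2))

# Scan the number of clusters, as in the question's loop.
scores = {k: calinski_harabasz(points, k) for k in range(2, 9)}
ks = sorted(scores)

# Interior values of k whose score beats both neighbours.
local_maxima = [k for prev, k, nxt in zip(ks, ks[1:], ks[2:])
                if scores[k] > scores[prev] and scores[k] > scores[nxt]]
best_k = max(scores, key=scores.get)

for k in ks:
    print(k, round(scores[k], 1))
print("local maxima:", local_maxima, "global maximum:", best_k)
```

On this toy data the scan shows two interior local maxima, at k = 3 (the coarse grouping) and k = 6 (the fine grouping), with the fine one being the global maximum. A different index, or different spacings between the groups, could well rank the two levels the other way round, which is the "sweet spot" effect described above.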
Upvotes: 0