Michael

Reputation: 35

How to interpret the Python clustering scores?

I am trying to use agglomerative clustering to cluster some data, but I don't know which number of clusters is best. Here are my results: [Graph: several evaluation scores in percent on the y-axis, number of clusters on the x-axis]

The dataset consists of 65 classes to be recognized. Gini value = 0.265.

  1. What should be chosen as the number of clusters? Maybe the same as the number of classes?
  2. What does the intersection point of completeness, homogeneity, and V-measure mean?
  3. What does the maximum of the adjusted mutual information score mean?

Upvotes: 0

Views: 279

Answers (1)

Has QUIT--Anony-Mousse

Reputation: 77474

  1. Don't use these measures for choosing k, because they compare the clustering to the known solution. If you already have a known solution, why choose an approximation instead?

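  2. Where homogeneity and completeness cross is probably just a coincidence. But study the equations: V-measure is the harmonic mean of the two, so it necessarily passes through the point where they intersect.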

  3. For AMI, NMI, ARI, etc., the maximum marks the k whose clustering agrees most with your existing labeled solution.
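To make the points above concrete, here is a minimal sketch using scikit-learn. The data is a synthetic stand-in from `make_blobs` with 5 classes (an assumption, not the asker's 65-class dataset), and the range of k is arbitrary. It computes each score against the known labels for several k and checks that V-measure equals the harmonic mean of homogeneity and completeness, which is why the three curves meet wherever the first two cross.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    adjusted_mutual_info_score,
    completeness_score,
    homogeneity_score,
    v_measure_score,
)

# Synthetic stand-in for the real data: 5 known classes (an assumption).
X, y_true = make_blobs(n_samples=300, centers=5, random_state=0)

results = {}
for k in range(2, 11):
    # Default linkage is Ward; the asker's linkage is not stated.
    labels = AgglomerativeClustering(n_clusters=k).fit_predict(X)

    # All four scores are external: each one needs y_true.
    h = homogeneity_score(y_true, labels)
    c = completeness_score(y_true, labels)
    v = v_measure_score(y_true, labels)
    ami = adjusted_mutual_info_score(y_true, labels)

    results[k] = (h, c, v, ami)
    print(f"k={k:2d}  homogeneity={h:.3f}  completeness={c:.3f}  "
          f"v_measure={v:.3f}  AMI={ami:.3f}")

# V-measure (with the default beta=1) is the harmonic mean of h and c,
# so h == c implies v == h == c.
for k, (h, c, v, ami) in results.items():
    harmonic_mean = 2 * h * c / (h + c)
    assert abs(v - harmonic_mean) < 1e-9

# The k with the highest AMI is the one that best matches y_true.
best_k = max(results, key=lambda k: results[k][3])
print("k with highest AMI:", best_k)
```

Note that every score here takes `y_true` as input, so the "best" k it reports only tells you which clustering best reproduces labels you already have.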

Upvotes: 1
