Reputation: 907
I am trying to evaluate cluster quality with Normalized Mutual Information (NMI), using scikit-learn's normalized_mutual_info_score() function. I understand the mathematical theory behind NMI, but I am a bit confused about how this function works.
The arguments are two arrays containing the labels of a clustering (labels_pred) and of a classification (labels_true). My understanding is that the labels in these two arrays are ordered. For example, if labels_pred=[0,0,1,1], then documents one and two are labeled zero, and documents three and four are labeled one. Now if labels_true=[0,0,0,1], the ground-truth class of documents one, two and three is zero, and that of the fourth is one. So the classifier misclassified the third document. Is my understanding correct?
Now look at the documentation, where labels_true = [0, 0, 0, 1, 1, 1] and labels_pred = [0, 0, 1, 1, 2, 2]. According to my understanding, the clustering algorithm predicted 3 documents (first, second and fourth) correctly. However, they say:
One can permute 0 and 1 in the predicted labels
normalized_mutual_info_score are symmetric: swapping the argument does not change the score
So if labels_pred = [1, 1, 0, 0, 2, 2], then only one document is correctly labeled. Yet according to them, this swapping will not change the NMI. Why is that? What is wrong with my understanding?
Thanks for taking the time to read my problem. I would highly appreciate any kind of help.
Upvotes: 0
Views: 185
Reputation: 77474
You cannot (and the methods do not) expect that "1" in one clustering is the same as "1" in the other clustering.
Every label is compared with every other label. Thus, renaming labels does not affect the result.
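To see why renaming cannot matter, here is a minimal sketch (plain Python, not scikit-learn code; the helper name to_partition is made up for illustration) that throws away the label names and keeps only which documents were grouped together:

```python
def to_partition(labels):
    """Group item indices by label, then forget the label names."""
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, set()).add(index)
    # A set of frozensets: only the grouping survives, not the names.
    return {frozenset(members) for members in groups.values()}


original = [0, 0, 1, 1, 2, 2]
permuted = [1, 1, 0, 0, 2, 2]  # same clustering with 0 and 1 renamed

# Both labelings put documents {0, 1}, {2, 3} and {4, 5} together.
print(to_partition(original) == to_partition(permuted))
```

The two label arrays from the question describe exactly the same grouping, which is all a clustering measure looks at.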
In fact, it is common that labels even come from different domains. Once you step outside the "everything is a number" box, it is fairly common that labels are, e.g. text.
Take, for example, the famous iris data set: the classes are not labeled 0, 1, 2, but iris setosa, iris virginica, etc. Yet if you run k-means, it will label the clusters 0, 1, 2, because it has no names for them.
Therefore, any cluster evaluation measure must be able to match different labels (or, more usually, compare every label of one labeling with every label of the other, across all pairs).
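As a rough illustration of "comparing every label with every other label", the sketch below computes NMI from the co-occurrence counts of label pairs. It is a hand-written approximation, not scikit-learn's implementation: the function name nmi is made up, and normalizing by the arithmetic mean of the two entropies is an assumption (I believe this matches the default in recent scikit-learn versions, but check your version). The string labels are shortened iris-style names chosen for the example.

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """NMI of two labelings, normalized by the mean of their entropies (assumed)."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Contingency counts: how often label a co-occurs with label b.
    joint = Counter(zip(labels_a, labels_b))

    mutual_info = sum(
        (n_ab / n) * log(n * n_ab / (count_a[a] * count_b[b]))
        for (a, b), n_ab in joint.items()
    )
    entropy_a = -sum((c / n) * log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * log(c / n) for c in count_b.values())
    mean_entropy = (entropy_a + entropy_b) / 2
    # Both labelings consist of a single group: treat them as identical.
    return mutual_info / mean_entropy if mean_entropy > 0 else 1.0


labels_true = ["setosa", "setosa", "setosa", "virginica", "virginica", "virginica"]
labels_pred = [0, 0, 1, 1, 2, 2]
labels_perm = [1, 1, 0, 0, 2, 2]  # cluster ids 0 and 1 renamed

score_pred = nmi(labels_true, labels_pred)
score_perm = nmi(labels_true, labels_perm)
print(score_pred, score_perm)
```

The label values are only ever used as dictionary keys for counting, never compared across the two arrays, so text labels against integer cluster ids work fine and renaming the ids leaves the score unchanged up to floating-point rounding.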
"swapping the arguments" however refers to something different: you can also swap labels_pred and labels_true, and the result does not change. There is no assumption on which of the arguments is "correct". These measures simply measure how similar the partitions are that you get from the labels.
Upvotes: 1