sovon

Reputation: 907

Why "swapping the argument does not change the score" in normalized_mutual_info_score?

I am trying to evaluate cluster quality with Normalized Mutual Information (NMI) using scikit-learn's normalized_mutual_info_score() function. I understand the mathematical theory of NMI, but I am a bit confused about how this function works.

The arguments are two arrays containing the labels of a clustering (labels_pred) and of a classification (labels_true). What I understand about these two arrays is that the labels are ordered. By that I mean, for example, if

labels_pred=[0,0,1,1], then documents one and two are labeled zero, and documents three and four are labeled one. Now if labels_true=[0,0,0,1], that means the ground-truth class of documents one, two and three is zero, and that of the fourth is one. So the classifier misclassified the third document. Is my understanding correct?

Now, look at the documentation, where labels_true = [0, 0, 0, 1, 1, 1] and labels_pred = [0, 0, 1, 1, 2, 2]. According to my understanding, the clustering algorithm predicted 3 documents (first, second and fourth) correctly. However, they say:

One can permute 0 and 1 in the predicted labels

normalized_mutual_info_score are symmetric: swapping the argument does not change the score

So if labels_pred = [1, 1, 0, 0, 2, 2], then only one document is correctly labeled. And according to them, this swapping will not change the NMI. Why is that? What is wrong with my understanding?

Thanks for taking the time to read my problem. I would highly appreciate any kind of help.

Upvotes: 0

Views: 185

Answers (1)

Has QUIT--Anony-Mousse

Reputation: 77474

You cannot (and the methods do not) expect that "1" in one clustering is the same as "1" in the other clustering.

Every label is compared with every other label. Thus, renaming labels does not affect the result.
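A quick way to convince yourself is to try it. This is a small check, assuming scikit-learn is installed; the arrays are the ones from the question, and the variable names are just for illustration:

```python
from sklearn.metrics import normalized_mutual_info_score

labels_true = [0, 0, 0, 1, 1, 1]
labels_pred = [0, 0, 1, 1, 2, 2]
# Same grouping of documents, but with the cluster names 0 and 1 exchanged.
labels_pred_renamed = [1, 1, 0, 0, 2, 2]

score = normalized_mutual_info_score(labels_true, labels_pred)
score_renamed = normalized_mutual_info_score(labels_true, labels_pred_renamed)

# The two scores agree (up to floating-point rounding).
print(score, score_renamed)
```

Counting position-by-position matches (3 "correct" versus 1 "correct") is an accuracy-style comparison, which NMI does not perform.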

In fact, it is common that labels even come from different domains. Once you step outside the "everything is a number" box, it is fairly common that labels are, e.g. text.

Take, for example, the famous iris data set. The classes are not labeled 0, 1, 2, but iris setosa, iris virginica, etc. Yet if you run k-means, it will label the clusters 0, 1, 2 because it doesn't have names for them.

Therefore, any cluster evaluation measure must be able to match different labels (or, usually, compare every label with every other label, matching all pairs).
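To see why the label names cannot matter, it can help to look at what the labels actually encode: a grouping of the sample indices. Here is a minimal sketch; the helper `partition` and the six-sample species list are made up for illustration and are not part of scikit-learn:

```python
def partition(labels):
    """Return the groups of sample indices that a label list induces."""
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, set()).add(index)
    # The label names are dropped; only the groups themselves remain.
    return {frozenset(members) for members in groups.values()}

# Hypothetical toy data: class names versus arbitrary cluster ids.
species = ["setosa", "setosa", "virginica", "virginica", "versicolor", "versicolor"]
clusters = [2, 2, 0, 0, 1, 1]

same_grouping = partition(species) == partition(clusters)
print(same_grouping)  # True: both label lists split the samples identically
```

Two label lists that induce the same groups describe the same clustering, whatever the groups happen to be called.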

"swapping the arguments" however refers to something different: you can also swap labels_pred and labels_true, and the result does not change. There is no assumption on which of the arguments is "correct". These measures simply measure how similar the partitions are that you get from the labels.

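Both properties, invariance to renaming and symmetry in the arguments, fall out of how the score is computed from co-occurrence counts. Below is a rough sketch, not scikit-learn's actual implementation. The function names are mine, and I normalize by the arithmetic mean of the two entropies, which I assume matches the library's default `average_method` in recent versions:

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of the label distribution."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def nmi(labels_a, labels_b):
    """Mutual information divided by the mean of the two entropies."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Co-occurrence counts: how many samples carry label a AND label b.
    joint = Counter(zip(labels_a, labels_b))
    mi = sum(
        (n_ab / n) * log(n * n_ab / (count_a[a] * count_b[b]))
        for (a, b), n_ab in joint.items()
    )
    mean_entropy = (entropy(labels_a) + entropy(labels_b)) / 2
    # Assumption: two single-cluster labelings are treated as identical.
    return mi / mean_entropy if mean_entropy > 0 else 1.0

labels_true = [0, 0, 0, 1, 1, 1]
labels_pred = [0, 0, 1, 1, 2, 2]

forward = nmi(labels_true, labels_pred)
swapped = nmi(labels_pred, labels_true)                # arguments exchanged
renamed = nmi(labels_true, ["b", "b", "a", "a", "c", "c"])  # clusters renamed

print(forward, swapped, renamed)  # all three agree up to rounding
```

Nothing in `nmi` ever tests a label from the first list for equality with a label from the second; only the counts of co-occurring pairs enter the formula.
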
Upvotes: 1
