Azim

Reputation: 1724

Which class label is considered negative in sklearn.metrics.confusion_matrix?

I know that we can use a list to indicate the order:

tn, fp, fn, tp = confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0], labels=[0,1]).ravel()

but the meaning of the elements of the matrix depends on two assumptions:

  1. Whether rows or columns are considered as ACTUAL (or PREDICTED) labels.
  2. Whether 0 or 1 is assumed to be the POSITIVE (or NEGATIVE) class.

Neither of these is directly mentioned in the docstring.

This question has already been asked here, but I am asking about the root of the confusion rather than the confusion in general. The issue is not how to interpret the confusion matrix, but how to set a specific class as positive or negative.

Upvotes: 2

Views: 3990

Answers (1)

Azim

Reputation: 1724

Short answer: In binary classification, when using the labels argument,

confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0], labels=[0,1]).ravel()

the class labels 0 and 1 are treated as Negative and Positive, respectively. This follows from the order given in the list, not from alphanumerical order.
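To make the effect of the list order concrete, here is a minimal sketch that runs the same call with both orderings. The variable names (cm_01, cm_10, and the unpacked counts) are only illustrative; the one thing relied on is that confusion_matrix puts actual classes on rows and predicted classes on columns, both in the order given by labels.

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1]
y_pred = [1, 1, 1, 0]

# Rows are actual classes, columns are predicted classes,
# both ordered as given in `labels`.
cm_01 = confusion_matrix(y_true, y_pred, labels=[0, 1])
cm_10 = confusion_matrix(y_true, y_pred, labels=[1, 0])

# With labels=[0, 1] the flattened order reads tn, fp, fn, tp
# (taking 1 as the positive class).
tn, fp, fn, tp = cm_01.ravel()

# With labels=[1, 0] the same counts come out as tp, fn, fp, tn.
tp_r, fn_r, fp_r, tn_r = cm_10.ravel()

print(cm_01.tolist())  # [[0, 2], [1, 1]]
print(cm_10.tolist())  # [[1, 1], [2, 0]]
```

So the function itself has no notion of "positive"; the tn, fp, fn, tp reading only holds when the class you regard as negative is listed first.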


Verification: Consider imbalanced class labels like these (the imbalance makes the cells easier to tell apart):

>>> from sklearn.metrics import confusion_matrix
>>> y_true = [0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,0]
>>> y_pred = [0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0]
>>> table = confusion_matrix(y_true, y_pred, labels=[0,1]).ravel()

This gives a flattened confusion table as follows:

>>> table
array([12,  1,  2,  1])

which corresponds to:

              Actual
        |   1   |   0  |
     ___________________
pred  1 |  TP=1 | FP=1 |
      0 |  FN=2 | TN=12|

where FN=2 means that there were 2 cases where the model predicted the sample to be negative (i.e., 0) but the actual label was positive (i.e., 1), hence False Negative equals 2.

Similarly for TN=12, in 12 cases the model correctly predicted the negative class (0), hence True Negative equals 12.

This way everything adds up, assuming that sklearn treats the first label in labels=[0,1] as the negative class. Therefore, here, 0, the first label, represents the negative class.
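As a cross-check that does not depend on sklearn at all, the four cells can be counted by hand. This is only a sketch: count_outcomes and its pos_label parameter are hypothetical names made up for this illustration, not part of any library.

```python
def count_outcomes(y_true, y_pred, pos_label=1):
    """Hypothetical helper: return (tn, fp, fn, tp) for a chosen positive label."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == pos_label:
            if actual == pos_label:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == pos_label:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tn, fp, fn, tp


y_true = [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0]

counts_pos1 = count_outcomes(y_true, y_pred, pos_label=1)
counts_pos0 = count_outcomes(y_true, y_pred, pos_label=0)

print(counts_pos1)  # (12, 1, 2, 1)
print(counts_pos0)  # (1, 2, 1, 12)
```

Treating 1 as positive reproduces array([12, 1, 2, 1]) from above, while treating 0 as positive gives the same four numbers in the opposite roles, which is what passing labels=[1,0] to confusion_matrix would produce after ravel().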

Upvotes: 5
