Reputation: 67
It's my understanding that confusion matrices should show the TRUE classes in the columns and the PREDICTED classes in the rows. Therefore the sum of the columns should be equal to the value_counts() of the TRUE series.
I have provided an example here:
from sklearn.metrics import confusion_matrix
pred = [0, 0, 0, 1]
true = [1, 1, 1, 1]
confusion_matrix(true, pred)
Why does this give me the following output? Surely it should be the transpose of that?
array([[0, 0],
       [3, 1]], dtype=int64)
Upvotes: 1
Views: 2347
Reputation: 1
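You can get the layout you want with sklearn: pass the predictions as the first argument, so that the rows hold the predicted classes and the columns hold the true classes, then relabel the axes. Adapt the code below to your own variables: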
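from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
import matplotlib.pyplot as plt

# predict: model predictions, y_test: true labels (your own variables)
fig, ax = plt.subplots(1, 1, figsize=(7, 4))
# Predictions go first, so rows = predicted classes and columns = true classes
cm = confusion_matrix(predict, y_test, labels=[1, 0])
ConfusionMatrixDisplay(cm, display_labels=[1, 0]).plot(values_format=".0f", ax=ax)
# Override the default axis labels to match the swapped layout
ax.set_xlabel("True Label")
ax.set_ylabel("Predicted Label")
plt.show()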
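As a quick check without plotting, swapping the two arguments should give the same result as transposing the default matrix. A minimal sketch using the data from the question (the names `sk_layout` and `swapped` are made up for illustration):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

true = [1, 1, 1, 1]
pred = [0, 0, 0, 1]

# Default sklearn call: rows are true classes, columns are predicted classes.
sk_layout = confusion_matrix(true, pred, labels=[0, 1])

# Swapped arguments: rows are predicted classes, columns are true classes.
swapped = confusion_matrix(pred, true, labels=[0, 1])

print(swapped)
print(np.array_equal(swapped, sk_layout.T))
```

So if you only need the numbers, taking `.T` of the default output is probably simpler than swapping the arguments.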
Upvotes: 0
Reputation: 11
The confusion probably arises because sklearn follows a different convention for the axes of a confusion matrix than the Wikipedia article. So, to answer your question: it gives you the output in that format because sklearn expects you to read it with the true classes in the rows and the predicted classes in the columns.
There are two different ways of writing a confusion matrix: the layout used in the Wikipedia article and the one used by sklearn.
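One way to see which convention sklearn uses is to compare the row and column sums with the class counts. This is only a sketch on the question's data; `np.bincount` stands in for `value_counts()` and the variable names are illustrative:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

true = [1, 1, 1, 1]
pred = [0, 0, 0, 1]

cm = confusion_matrix(true, pred, labels=[0, 1])

# Per-class counts, playing the role of value_counts() on each series.
true_counts = np.bincount(true, minlength=2)
pred_counts = np.bincount(pred, minlength=2)

# In sklearn's layout the ROW sums match the true counts,
# and the COLUMN sums match the predicted counts.
rows_match_true = np.array_equal(cm.sum(axis=1), true_counts)
cols_match_pred = np.array_equal(cm.sum(axis=0), pred_counts)
print(rows_match_true, cols_match_pred)
```

If the columns were the true classes, as the question assumes, the column sums would have to match the true counts instead.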
Upvotes: 1
Reputation: 33147
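scikit-learn's confusion matrix follows a specific order and structure: entry (i, j) counts the samples whose true class is i and whose predicted class is j, so rows are true labels and columns are predicted labels, with the labels in sorted order unless you pass labels explicitly.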
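For the binary case, that ordering means the flattened matrix can be unpacked as (tn, fp, fn, tp). A short sketch with the question's data, treating class 1 as the positive class:

```python
from sklearn.metrics import confusion_matrix

true = [1, 1, 1, 1]
pred = [0, 0, 0, 1]

# Row 0 is true class 0: [tn, fp]; row 1 is true class 1: [fn, tp].
tn, fp, fn, tp = confusion_matrix(true, pred, labels=[0, 1]).ravel()
print(tn, fp, fn, tp)
```

The 3 in the bottom-left cell of the question's output is therefore the count of false negatives: three samples of true class 1 that were predicted as 0.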
Upvotes: 0