Getting accuracy for each category in a multi-label classification problem

Question

I need to calculate the accuracy for each category (NOT the overall accuracy) in a multi-label classification problem. It is easy to get the precision, recall and F-score for each category using classification_report from the scikit-learn library. There are 13 categories, distributed as follows:

                   precision    recall  f1-score   support

     Category 1       0.58      0.48      0.53       244
     Category 2       0.91      0.85      0.88       728
     Category 3       0.90      0.92      0.91      1319
     Category 4       0.70      0.55      0.62       533
     Category 5       1.00      0.10      0.18        20
     Category 6       0.94      0.84      0.89      2038
     Category 7       0.83      0.78      0.80      1930
     Category 8       0.85      0.44      0.58       113
     Category 9       0.88      0.87      0.87      1329
     Category 10      0.79      0.54      0.64        61
     Category 11      0.81      0.77      0.79       562
     Category 12      0.71      0.62      0.66       416
     Category 13      0.76      0.60      0.67       500

      micro avg       0.86      0.78      0.82      9793
      macro avg       0.82      0.64      0.69      9793
   weighted avg       0.85      0.78      0.81      9793
    samples avg       0.86      0.82      0.83      9793

I know that accuracy can be computed as Accuracy = (TP + TN) / (TP + TN + FP + FN), but I could not work out how to get TP and TN for this multi-label classification problem.

There is a similar question on Stack Overflow, "Calculating accuracy from precision, recall, f1-score - scikit-learn", but it covers binary classification problems only.

Note: I have tried multilabel_confusion_matrix and confusion_matrix from sklearn.metrics to extract the confusion matrix, but both raised the same error: ValueError: Classification metrics can't handle a mix of multilabel-indicator and continuous-multioutput targets

Any ideas?

Federico Malerba · Accepted Answer

You can manually compute the per-class accuracy from the original arrays with the following code:

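import numpy as np

# Assumes y_true and y_pred are 1-D NumPy arrays of class labels
class_accuracies = []
for class_ in np.unique(y_true):
    # Fraction of the samples belonging to class_ that were predicted as class_
    class_acc = np.mean(y_pred[y_true == class_] == class_)
    class_accuracies.append(class_acc)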
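
For multi-label data, where y_true is a binary indicator matrix with one column per category, one possible variant is to count TP, TN, FP and FN per column and apply the formula from the question. The following is a minimal sketch rather than the method above: the toy arrays, the 0.5 threshold and the helper name per_label_accuracy are assumptions made for illustration. Converting continuous scores to 0/1 predictions first is also likely what resolves the "mix of multilabel-indicator and continuous-multioutput" error, since that message indicates the predictions were passed as real-valued scores rather than binary labels.

```python
import numpy as np

def per_label_accuracy(y_true, y_score, threshold=0.5):
    """Accuracy of each label column: (TP + TN) / (TP + TN + FP + FN).

    y_true  : (n_samples, n_labels) binary indicator matrix
    y_score : (n_samples, n_labels) scores or probabilities
    threshold is an assumed cut-off; adjust it to match the model.
    """
    y_true = np.asarray(y_true).astype(bool)
    # Binarize the continuous scores into hard 0/1 predictions
    y_pred = np.asarray(y_score) >= threshold

    # Per-column confusion counts
    tp = np.sum(y_true & y_pred, axis=0)
    tn = np.sum(~y_true & ~y_pred, axis=0)
    fp = np.sum(~y_true & y_pred, axis=0)
    fn = np.sum(y_true & ~y_pred, axis=0)

    return (tp + tn) / (tp + tn + fp + fn)

# Toy data (hypothetical): 4 samples, 3 categories
y_true = np.array([[1, 0, 1],
                   [0, 1, 1],
                   [1, 1, 0],
                   [0, 0, 1]])
y_score = np.array([[0.9, 0.2, 0.4],
                    [0.1, 0.8, 0.7],
                    [0.6, 0.3, 0.2],
                    [0.3, 0.1, 0.9]])

acc = per_label_accuracy(y_true, y_score)
print(acc)  # one accuracy value per category
```

Once the predictions are binarized, sklearn.metrics.multilabel_confusion_matrix(y_true, y_pred) should return the same counts as one 2x2 matrix per label, laid out as [[TN, FP], [FN, TP]]. Note that for rare categories (such as Category 5 with support 20) per-label accuracy is dominated by true negatives, so it can look high even when recall is low.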
