Ali Ok

Reputation: 23

Classification report in scikit learn

I want to classify faults and no-fault conditions for a device. Label A for fault and label B for no-fault.

scikit-learn gives me the following classification report:

        precision    recall   f1-score   support
A       0.82         0.18     0.30       2565
B       0.96         1.00     0.98       45100
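A report like the one above comes from scikit-learn's `classification_report`. The labels and arrays below are made-up toy data, not the asker's actual predictions:

```python
# Toy sketch of producing a classification report with scikit-learn.
# "A" = fault, "B" = no-fault, as in the question; data is illustrative.
from sklearn.metrics import classification_report

y_true = ["A", "A", "A", "B", "B", "B", "B", "B"]
y_pred = ["A", "B", "B", "B", "B", "B", "B", "A"]

# Prints per-class precision, recall, f1-score and support.
print(classification_report(y_true, y_pred))
```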

Now, which of the A or B results should I use to characterize the model's performance?

Upvotes: 2

Views: 673

Answers (1)

Lukasz Tracewski

Reputation: 11377

Introduction

There's no single score that universally describes a model; it all depends on your objective. In your case you're dealing with fault detection, so you're interested in finding faults among a much greater number of non-fault cases. The same logic applies to, e.g., screening a population for individuals carrying a pathogen.

In such cases it's typically very important to have high recall (also known as sensitivity) for the "fault" cases (or, in the medical analogy, for the cases where you might be ill). In such screening it's usually fine to diagnose as "fault" something that actually works fine; that is your false positive. Why? Because the cost of missing a faulty part in an engine, or a tumor, is much greater than the cost of asking an engineer or doctor to verify the case.
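If recall for the fault class is what matters, it can be computed directly rather than read off the full report. A minimal sketch with made-up labels, using scikit-learn's `recall_score` with `pos_label` set to the fault class:

```python
# Recall for the "fault" class only; data here is illustrative.
from sklearn.metrics import recall_score

y_true = ["A", "B", "A", "B", "B", "A", "B", "B"]  # 3 actual faults
y_pred = ["A", "B", "B", "B", "B", "A", "B", "B"]  # 2 of them detected

r = recall_score(y_true, y_pred, pos_label="A")
print(r)  # 2 of 3 actual faults found -> 0.6666666666666666
```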

Solution

Assuming that this holds in your case (recall for faults is the most important metric), you should be looking at the recall for label A (faults). By that standard your model is doing rather poorly: it finds only 18% of the faults. Much of this likely stems from the fact that the number of faults is roughly 18x smaller than the number of non-faults (2565 vs. 45100), introducing heavy class imbalance that needs to be tackled.
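One common way to tackle such imbalance (an assumption on my part, not something the answer prescribes) is to weight classes inversely to their frequency, e.g. via `class_weight="balanced"` in many scikit-learn estimators. A sketch on synthetic imbalanced data:

```python
# Compare fault-class recall with and without class weighting.
# Synthetic data only; a ~19:1 imbalance stands in for the asker's ~18:1.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

plain = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
balanced = LogisticRegression(max_iter=1000,
                              class_weight="balanced").fit(X_tr, y_tr)

# Recall on the rare (fault-like) class; balancing typically raises it,
# usually at the cost of more false positives.
print(recall_score(y_te, plain.predict(X_te), pos_label=1))
print(recall_score(y_te, balanced.predict(X_te), pos_label=1))
```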

I can think of a number of scenarios where this score would not actually be bad. If you can detect 18% of all faults in an engine (on top of other systems) without introducing false alarms, that can be really useful: you don't want to fire alarms at the driver too often while everything is fine. At the same time, you likely don't want to apply the same logic to, e.g., cancer detection and tell a patient "everything's OK" while there's a very high risk that the diagnosis is wrong.

Metrics

For the sake of completeness, I will explain the terms. Consider these definitions:


  • tp - true positive (a real fault, correctly detected)
  • tn - true negative (correctly identified as not a fault)
  • fp - false positive (detected a fault while the device is actually OK)
  • fn - false negative (reported OK while it's actually a fault)
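From these counts the report's metrics follow directly. With the toy numbers below (chosen by me to roughly reproduce label A's row, not taken from the asker's data), precision is tp/(tp+fp), recall is tp/(tp+fn), and F1 is their harmonic mean:

```python
# Toy counts chosen so recall matches label A's 0.18 above.
tp, fp, fn = 18, 4, 82  # a detector finding 18 of 100 real faults

precision = tp / (tp + fp)                        # 18/22 ~ 0.82
recall = tp / (tp + fn)                           # 18/100 = 0.18
f1 = 2 * precision * recall / (precision + recall)  # ~ 0.30

print(precision, recall, f1)
```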

Here is one article that attempts to explain nicely what precision, recall and F1 are.

Upvotes: 2

Related Questions