Reputation: 4493
Similar to this previous question: How to calculate the overall accuracy of custom trained spacy ner model with confusion matrix?
spaCy provides Precision, Recall, and F1 scores in the meta.json file when it writes out the trained NER model. These values are also available when running the evaluation command python -m spacy evaluate. However, is it possible to get the counts of TP, FP, and FN used to calculate these values?
Furthermore, is it possible to output the actual text/tokens which resulted in a false positive or false negative?
Upvotes: 3
Views: 1829
Reputation: 536
I think you can get the counts of TP, FP, and FN when evaluating across all entity types using
scorer = nlp.evaluate(testset)
TP = scorer.ner.tp
FP = scorer.ner.fp
FN = scorer.ner.fn
and when evaluating per entity type using
scorer = nlp.evaluate(testset)
for ent_type, scorer_ent_type in scorer.ner_per_ents.items():
    TP = scorer_ent_type.tp
    FP = scorer_ent_type.fp
    FN = scorer_ent_type.fn
    print('Ent_type:', ent_type, 'TP:', TP, 'FP:', FP, 'FN:', FN)
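To make the expected input a bit more concrete, here is a small self-contained sketch of the same approach. This assumes spaCy v2 (where nlp.evaluate returns a Scorer object) and uses a tiny hypothetical test set in the usual (text, {"entities": [(start_char, end_char, label), ...]}) training-data format; replace the model and the examples with your own.

import spacy

# Hypothetical test set in the standard spaCy v2 training-data format
testset = [
    ("Apple is looking at buying U.K. startup for $1 billion",
     {"entities": [(0, 5, "ORG"), (27, 31, "GPE"), (44, 54, "MONEY")]}),
    ("San Francisco considers banning sidewalk delivery robots",
     {"entities": [(0, 13, "GPE")]}),
]

nlp = spacy.load("en_core_web_sm")  # or the path to your custom trained NER model
scorer = nlp.evaluate(testset)

# Counts aggregated over all entity types
print("TP:", scorer.ner.tp, "FP:", scorer.ner.fp, "FN:", scorer.ner.fn)

# Counts broken down per entity type
for ent_type, prf in scorer.ner_per_ents.items():
    print(ent_type, "TP:", prf.tp, "FP:", prf.fp, "FN:", prf.fn)

Note that the evaluate API changed in spaCy v3 (it takes Example objects and returns a dictionary of scores), so this snippet only applies to v2.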
When training and evaluating your spaCy NER model, as I understand it, the scores across all entities are calculated in this line in the spaCy code, and the scores per entity type are calculated in this line. In both cases, the score_set function is called, which updates the TP, FP, and FN counts in the scorer. If you set breakpoints at those lines and debug, you can inspect the variables doc, cand_ents, and gold_ents, and look at the FPs and FNs with
print(doc)
print(cand_ents - gold_ents)  # FP
print(gold_ents - cand_ents)  # FN
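If you'd rather not step through spaCy's internals with a debugger, the same set-difference idea can be reproduced in your own code to print the spans that were missed or spuriously predicted, which also answers the second part of the question. This is a minimal sketch under the same assumptions as above (spaCy v2, hypothetical testset in (text, {"entities": [...]}) format); it compares predicted and gold entities as (start_char, end_char, label) tuples, mirroring the cand_ents - gold_ents idea.

import spacy

nlp = spacy.load("en_core_web_sm")  # or your custom trained NER model

testset = [
    ("Apple is looking at buying U.K. startup for $1 billion",
     {"entities": [(0, 5, "ORG"), (27, 31, "GPE"), (44, 54, "MONEY")]}),
]

for text, annotations in testset:
    doc = nlp(text)
    # Predicted and gold entities as comparable (start_char, end_char, label) tuples
    cand_ents = {(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents}
    gold_ents = set(annotations["entities"])
    for start, end, label in cand_ents - gold_ents:
        print("FP:", repr(text[start:end]), label)  # predicted but not in the gold data
    for start, end, label in gold_ents - cand_ents:
        print("FN:", repr(text[start:end]), label)  # in the gold data but not predicted

Because the comparison is strict, an entity with the right span but the wrong label shows up as both an FP and an FN, which matches how the scorer counts it.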
Late answer but I hope it helps.
Upvotes: 2