W.P. McNeill

Reputation: 17066

Why does my custom spaCy entity type get detected?

I am writing a spaCy program in which I want to define a custom named entity label. Following the example here, I add a label called MY_NEW_LABEL to the pipeline's NER component.

import spacy

nlp = spacy.load("en_core_web_lg")

ner = nlp.get_pipe("ner")
new_label = "MY_NEW_LABEL"
ner.add_label(new_label)

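documents_path = "my_document.txt"
# Read the whole file and close the handle when done.
with open(documents_path) as f:
    document = nlp(f.read())
# Print only the entities tagged with the new label.
print([e for e in document.ents if e.label_ == new_label])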

When I run this program, it prints a list of entities labeled MY_NEW_LABEL. I don't see how this is possible, because I never train the model on the label or do anything else with it.

Clearly I'm misunderstanding how custom entity labels work, but I can't figure out from the documentation why this happens. Can anyone tell me why my program doesn't print an empty list?

Upvotes: 0

Views: 313

Answers (1)

W.P. McNeill

Reputation: 17066

This is unexpected behavior. I opened it as spaCy issue 1697: Custom Entity Labels Are Erroneously Detected.
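
Until the label has actually been trained, one possible stopgap is to filter it out of the results yourself. The sketch below is only an illustration, not a fix taken from the issue: `drop_untrained_labels` is a hypothetical helper name, and the `SimpleNamespace` objects are stand-ins that mimic the `text` and `label_` attributes of spaCy spans so the example runs without a model.

```python
from types import SimpleNamespace


def drop_untrained_labels(ents, untrained_labels):
    """Return the entities whose label_ is not in untrained_labels.

    Hypothetical helper: works on any objects that expose a `label_`
    attribute, which spaCy entity spans do.
    """
    untrained = set(untrained_labels)
    return [e for e in ents if e.label_ not in untrained]


# Stand-in entities (an assumption for illustration, not real model output).
fake_ents = [
    SimpleNamespace(text="London", label_="GPE"),
    SimpleNamespace(text="the report", label_="MY_NEW_LABEL"),
]

kept = drop_untrained_labels(fake_ents, {"MY_NEW_LABEL"})
print([(e.text, e.label_) for e in kept])  # [('London', 'GPE')]
```

With a real pipeline you would pass `document.ents` as the first argument. This only hides the spurious predictions; it does not change what the model outputs.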

Upvotes: 2
