Reputation: 259
I am trying to create a custom entity label called FRUIT using the rule-based Matcher (i.e. by adding on_match rules), following the spaCy guide. I'm using spaCy 2.0.11, so I believe the steps have changed compared to spaCy 1.x.
Example: doc = nlp('Tom wants to eat some apples at the United Nations')
Expected text and entity outputs:
Tom PERSON
apples FRUIT
the United Nations ORG
However, I get the following error: [E084] Error assigning label ID 7429577500961755728 to span: not in StringStore. My code is below. Strangely, when I change nlp.vocab.strings['FRUIT'] to nlp.vocab.strings['EVENT'] it works, but apples is then assigned the entity label EVENT. Is anyone else encountering this issue?
import spacy
from spacy.matcher import Matcher

# Assumption: the post does not show how nlp was loaded; any English model with NER should do.
nlp = spacy.load('en_core_web_sm')
doc = nlp('Tom wants to eat some apples at the United Nations')
FRUIT = nlp.vocab.strings['FRUIT']

def add_ent(matcher, doc, i, matches):
    # Get the current match and create a tuple of entity label, start and end.
    # Append the entity to the doc's entities. (Don't overwrite doc.ents!)
    match_id, start, end = matches[i]
    doc.ents += ((FRUIT, start, end),)

matcher = Matcher(nlp.vocab)
pattern = [{'LOWER': 'apples'}]
matcher.add('AddApple', add_ent, pattern)
matches = matcher(doc)
for ent in doc.ents:
    print(ent.text, ent.label_)
Upvotes: 0
Views: 882
Reputation: 259
Oh okay, I think I found a solution. The label has to be added to nlp.vocab.strings first if it is not already there:
nlp.vocab.strings.add('FRUIT')
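As far as I can tell, the reason is that looking a string up with strings['FRUIT'] only computes its hash and does not store the string, whereas strings.add('FRUIT') stores it so the hash can be resolved back to text. Here is a minimal sketch of that difference. It uses a fresh, standalone StringStore instead of nlp.vocab.strings so that no model is needed; that substitution is my assumption for illustration, and the variable names are made up.

```python
from spacy.strings import StringStore

# Standalone store used for illustration only; in the question this
# would be nlp.vocab.strings.
strings = StringStore()

# Looking up by string returns a hash but does not store the string.
label_hash = strings['FRUIT']
known_before = 'FRUIT' in strings

# Resolving the hash back to text fails while the string is not stored.
try:
    strings[label_hash]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# add() stores the string and returns the same hash.
added_hash = strings.add('FRUIT')
known_after = 'FRUIT' in strings
resolved = strings[label_hash]

print(known_before, lookup_failed, known_after, resolved)
```

In the question's script, this would mean calling nlp.vocab.strings.add('FRUIT') before the matcher runs. EVENT presumably works without it because that label is already known to the vocabulary.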
Upvotes: 5