hou2zi0

Reputation: 71

Extending Lemma Lookup Table in Spacy

I am currently processing texts with the NLP library spaCy. spaCy, however, does not lemmatize all words correctly, so I want to extend its lookup table. Currently I merge spaCy's constant lookup table with my extension and then overwrite spaCy's native lookup table with the result.

I have the feeling, however, that this approach may not be the best or most consistent one.

Question: Is there another way to update the lookup table in spaCy, e.g. an update or extend function? I have read the docs and could not find anything like that. Or is this approach "just fine"?

Working example of my current approach:

import spacy

nlp = spacy.load('de')  # German model, spaCy v2.x

# Merge the custom entries into spaCy's constant lookup table,
# then write the merged table back
spacy_lookup = spacy.lang.de.LOOKUP
new_lookup = {'AAA': 'Anonyme Affen Allianz', 'BBB': 'Berliner Bauern Bund', 'CCC': 'Chaos Chaoten Club'}
spacy_lookup.update(new_lookup)
spacy.lang.de.LOOKUP = spacy_lookup

tagged = nlp("Die AAA besiegt die BBB und den CCC unverdient.")
for token in tagged:
    print(token.lemma_)

Die
Anonyme Affen Allianz
besiegen
der
Berliner Bauern Bund
und
der
Chaos Chaoten Club
unverdient
.

Upvotes: 5

Views: 2021

Answers (1)

gdaras

Reputation: 10149

Your solution seems fine.

However, a cleaner workaround would be to take advantage of spaCy's custom pipeline components. Specifically, you can create a new component that overrides the lemma attribute whenever a token's text appears in your custom lookup table, and then add it to the pipeline.

Example code:

import spacy

custom_lookup = {'AAA': 'Anonyme Affen Allianz', 'BBB': 'Berliner Bauern Bund', 'CCC': 'Chaos Chaoten Club'}

def change_lemma_property(doc):
    # Override the lemma for any token whose text is in the custom table
    for token in doc:
        if token.text in custom_lookup:
            token.lemma_ = custom_lookup[token.text]
    return doc

nlp = spacy.load('de')
nlp.add_pipe(change_lemma_property, first=True)  # add the component at the start of the pipeline

text = 'Die AAA besiegt die BBB und den CCC unverdient.'
doc = nlp(text)
for token in doc:
    print(token.lemma_)
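
As a side note: if you can upgrade to spaCy v2.2 or later, the lemma lookup tables are exposed through the `nlp.vocab.lookups` container (`spacy.lookups.Lookups`), which does provide set/update-style methods, so no monkey-patching of `spacy.lang.de.LOOKUP` is needed. A minimal sketch of that API, using a standalone `Lookups` object (no model download required) — the table name `"lemma_lookup"` is the one the v2.2+ lemmatizer reads:

```python
from spacy.lookups import Lookups

# Create a lookups container and add a table with some initial entries
lookups = Lookups()
table = lookups.add_table("lemma_lookup", {"AAA": "Anonyme Affen Allianz"})

# Entries can be added one by one; string keys are hashed internally
table.set("BBB", "Berliner Bauern Bund")
table.set("CCC", "Chaos Chaoten Club")

print(table.get("BBB"))
```

With a loaded model you would grab the existing table via `nlp.vocab.lookups.get_table("lemma_lookup")` and call `set` on it in the same way.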

Upvotes: 4
