spacy how to add patterns to existing Entity ruler?

Question

My spaCy version is 2.3.7. I have an existing trained custom NER model with NER and Entity Ruler pipes. I want to update and retrain this existing pipeline.

The code that created the entity ruler pipe was as follows:

from spacy.pipeline import EntityRuler

ruler = EntityRuler(nlp)
# add_patterns expects a list of pattern dicts on each call
for i in patt_dict:
    ruler.add_patterns(i)
nlp.add_pipe(ruler, name="entity_ruler")

Where patt_dict is the original patterns dictionary I had made.

Now that training is finished, I have more input data and want to train the model further on it.

How can I modify the above code to add more pattern dictionaries to the entity ruler when I later load the spaCy model and retrain it with more input data?

polm23 · Accepted Answer

It is generally better to retrain from scratch. If you train only on new data you are likely to run into "catastrophic forgetting", where the model forgets anything not in the new data.

This is covered in detail in this spaCy blog post. As of v3 the approach outlined there is available in spaCy, but it's still experimental and needs some work. In any case, it's still kind of a workaround, and the best thing is to train from scratch with all data.
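
For the ruler patterns specifically, one way to follow the "all data" advice is to merge the original and new pattern lists before rebuilding the pipeline. The following is a minimal sketch in plain Python, not something taken from the question: the helper name merge_patterns and the sample patterns are hypothetical, and it assumes each pattern is a JSON-serializable dict in the usual {"label": ..., "pattern": ...} form.

```python
import json


def merge_patterns(old_patterns, new_patterns):
    """Combine two lists of EntityRuler-style pattern dicts, dropping exact duplicates.

    Hypothetical helper for illustration; order of first appearance is kept.
    """
    merged = []
    seen = set()
    for pattern in list(old_patterns) + list(new_patterns):
        # Serialize with sorted keys so equal dicts map to the same key
        key = json.dumps(pattern, sort_keys=True)
        if key not in seen:
            seen.add(key)
            merged.append(pattern)
    return merged


# Made-up sample patterns (assumption: not the asker's real data)
old_patterns = [
    {"label": "ORG", "pattern": "Acme Corp"},
    {"label": "GPE", "pattern": [{"LOWER": "san"}, {"LOWER": "francisco"}]},
]
new_patterns = [
    {"label": "ORG", "pattern": "Acme Corp"},  # exact duplicate of an old pattern
    {"label": "PRODUCT", "pattern": "Widget X"},
]

all_patterns = merge_patterns(old_patterns, new_patterns)
```

The merged list can then be passed to ruler.add_patterns(all_patterns) when the pipeline is rebuilt. If you do load the saved model instead, nlp.get_pipe("entity_ruler") should return the existing ruler, and calling add_patterns on it appends the new patterns; the retraining caveat above still applies to the NER component.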
