zzzz

Reputation: 71

spacy 3 train custom ner model

I tried to train on this dataset:

[('text data text data text data text data text data text data text data text data.', {'entities': [(7, 19, 'PERSON'), (89, 91, 'PERSON'), (98, 101, 'PERSON')]}), ('"text data text data text data text data text data text data text data text data text data text data text data text data.', {'entities': [(119, 137, 'PERSON')]}),]

import random
import spacy

n_iter = 8
nlp = spacy.blank('en')
ner = nlp.create_pipe('ner')
for _, annotations in TRAIN_DATA:
    for _s, _e, label in annotations.get('entities', []):
        print('Adding label - "', label, '"')
        ner.add_label(label)

from spacy.training.example import Example
from spacy.util import minibatch, compounding

other_pipes = [pipe for pipe in nlp.pipe_names if pipe != 'ner']
with nlp.disable_pipes(*other_pipes): 
    optimizer = nlp.begin_training()
    for itn in range(n_iter):
        random.shuffle(TRAIN_DATA)
        losses = {}
        for batch in spacy.util.minibatch(TRAIN_DATA, size=compounding(4.0, 32.0, 1.001)):
            for text, annotations in batch:
                doc = nlp.make_doc(text)
                example = Example.from_dict(doc, annotations)
                nlp.update([example], drop=0.35, losses=losses, sgd=optimizer)
            print('losses -', losses)

The result was losses - {} on every iteration, no matter how many iterations I ran.

Does anyone know what is wrong?

Upvotes: 3

Views: 1046

Answers (2)

Kasun Imesha

Reputation: 11

In spaCy 3.x you can both initialize and add the component using the add_pipe method. This is discussed in this issue as well.
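A minimal sketch of that spaCy 3.x pattern (assuming spaCy 3.x is installed): add_pipe creates the component and registers it on the pipeline in one call.

```python
import spacy

nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")  # creates AND adds the component; no create_pipe needed
print(nlp.pipe_names)      # ['ner']
```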

Upvotes: 0

aab

Reputation: 11474

You haven't actually added the NER component to your pipeline. Replace nlp.create_pipe with:

ner = nlp.add_pipe("ner")

(Be aware that you're training on individual examples rather than batches of examples in this setup, so the batching code isn't doing anything useful. Have a look at the NER demo projects for more examples of how to do this with the train CLI, which has a more flexible and optimized training loop.)
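Putting the fix into the original loop, a minimal sketch might look like the following (assumptions: spaCy 3.x; the toy sentence and entity offsets are made up for illustration, since the question's text is placeholder data):

```python
import random
import spacy
from spacy.training.example import Example
from spacy.util import minibatch, compounding

# Hypothetical toy data with valid character offsets
TRAIN_DATA = [
    ("Alice met Bob in Paris.",
     {"entities": [(0, 5, "PERSON"), (10, 13, "PERSON")]}),
]

nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")  # the fix: add_pipe registers the component on the pipeline
for _, annotations in TRAIN_DATA:
    for _s, _e, label in annotations["entities"]:
        ner.add_label(label)

optimizer = nlp.begin_training()
for itn in range(8):
    random.shuffle(TRAIN_DATA)
    losses = {}
    for batch in minibatch(TRAIN_DATA, size=compounding(4.0, 32.0, 1.001)):
        # pass the whole batch to nlp.update so the batching actually matters
        examples = [Example.from_dict(nlp.make_doc(t), a) for t, a in batch]
        nlp.update(examples, drop=0.35, losses=losses, sgd=optimizer)
    print("losses -", losses)  # now non-empty, e.g. {'ner': ...}
```

With the component actually in the pipeline, nlp.update has something to train and the losses dict is populated each iteration.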

Upvotes: 2
