Training custom NER model

Question

I have been training a custom spaCy NER model to find cities in text and tag them with custom entity labels.

Example:

    ('paragraph Designated Offices Party A New York Party B Delaware paragraph pricing source calculation Market Value shall generally accepted pricing source reasonably agreed parties paragraph Spot rate Spot Rate specified paragraph reasonably agreed parties',
     {'entities': [(37, 45, 'DesignatedBankLoc'), (54, 62, 'CounterpartyBankLoc')]})

I am looking for two entities here, DesignatedBankLoc and CounterpartyBankLoc. A single text can also contain multiple entities.
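Because the entity offsets are character positions with an exclusive end, a quick way to check them is to slice the text and look at what each span actually selects. The following is a minimal sketch in plain Python; the helper names `find_span` and `show_spans` are hypothetical and not part of spaCy, and the text is a shortened version of the example.

```python
def find_span(text, phrase, label, start_at=0):
    """Build a (start, end, label) triple for the first occurrence of phrase."""
    start = text.index(phrase, start_at)  # raises ValueError if phrase is absent
    return (start, start + len(phrase), label)


def show_spans(text, entities):
    """Return the substring that each (start, end, label) triple selects."""
    return [(text[start:end], label) for start, end, label in entities]


# Shortened version of the example text, for illustration only
text = 'paragraph Designated Offices Party A New York Party B Delaware paragraph'

# Compute the offsets instead of typing them by hand
entities = [
    find_span(text, 'New York', 'DesignatedBankLoc'),
    find_span(text, 'Delaware', 'CounterpartyBankLoc'),
]

print(entities)
print(show_spans(text, entities))
```

If a slice comes back as a partial word or with a trailing space, the span does not line up with the token boundaries and is worth correcting before training.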

Currently I am training on 60 rows of data, as follows:

import spacy
import random
def train_spacy(data,iterations):
    TRAIN_DATA = data
    nlp = spacy.blank('en')  # create blank Language class
    # create the built-in pipeline components and add them to the pipeline
    # nlp.create_pipe works for built-ins that are registered with spaCy
    if 'ner' not in nlp.pipe_names:
        ner = nlp.create_pipe('ner')
        nlp.add_pipe(ner, last=True)
    else:
        # reuse the existing component so that `ner` is always defined
        ner = nlp.get_pipe('ner')


    # add labels
    for _, annotations in TRAIN_DATA:
        for ent in annotations.get('entities'):
            # print (ent[2])
            ner.add_label(ent[2])

    # get names of other pipes to disable them during training
    other_pipes = [pipe for pipe in nlp.pipe_names if pipe != 'ner']
    with nlp.disable_pipes(*other_pipes):  # only train NER
        optimizer = nlp.begin_training()
        for itn in range(iterations):
            print("Starting iteration " + str(itn))
            random.shuffle(TRAIN_DATA)
            losses = {}
            for text, annotations in TRAIN_DATA:
                nlp.update(
                    [text],  # batch of texts
                    [annotations],  # batch of annotations
                    drop=0.5,  # dropout - make it harder to memorise data
                    sgd=optimizer,  # callable to update weights
                    losses=losses)
            print(losses)
    return nlp


prdnlp = train_spacy(TRAIN_DATA, 100)

My problem is:

The model predicts correctly when the input contains cities that appeared in the training data, whether the surrounding text follows the same pattern or a different one. It predicts no entities at all when the input contains cities that never occur in the training set, again regardless of the surrounding pattern.
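One way to make this observation reproducible is to hold out every row that mentions certain cities and train only on the rest. The following is a minimal sketch, assuming the same `(text, {'entities': [...]})` row format as above; the function names and the sample rows are hypothetical and only illustrate the split.

```python
def entity_texts(example):
    """Return the set of surface strings annotated in one training row."""
    text, annotations = example
    return {text[start:end] for start, end, _ in annotations['entities']}


def split_by_entity_text(data, held_out):
    """Put every row that mentions a held-out city into the test set."""
    train, test = [], []
    for example in data:
        if entity_texts(example) & held_out:
            test.append(example)
        else:
            train.append(example)
    return train, test


def make_row(designated, counterparty):
    """Build one hypothetical row with offsets computed, not typed by hand."""
    text = 'Party A ' + designated + ' Party B ' + counterparty
    d_start = len('Party A ')
    c_start = d_start + len(designated) + len(' Party B ')
    return (text, {'entities': [
        (d_start, d_start + len(designated), 'DesignatedBankLoc'),
        (c_start, c_start + len(counterparty), 'CounterpartyBankLoc'),
    ]})


# Hypothetical rows, for illustration only
rows = [
    make_row('New York', 'Delaware'),
    make_row('London', 'Delaware'),
    make_row('Paris', 'Tokyo'),
]

train, test = split_by_entity_text(rows, held_out={'Paris', 'Tokyo'})
print(len(train), len(test))
```

Training on `train` and then inspecting `prdnlp(text).ents` for each text in `test` would show how the model behaves on cities it has never seen.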

Please suggest why this is happening, and help me understand conceptually how the model is being trained.
