SpaCy 3 -- ValueError: [E973] Unexpected type for NER data

Question

I've been stuck on this problem for a long time and can't find a solution. I want to train an NER model to recognise animal and species names, so I created a mock training set to test it out. However, I keep getting ValueError: [E973] Unexpected type for NER data

I have tried solutions from other Stack Overflow posts, including:

Double-checking that the format and types of my training set are correct
Using spacy.load('en_core_web_sm') instead of spacy.blank('en')
Installing spacy-lookups-data

All of these result in the same error.
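
For reference, the format check in the first item can be sketched roughly as below. This is only an illustration: check_train_data is a hypothetical helper (not part of spaCy), and it assumes the (text, {"entities": [(start, end, label)]}) layout used by the mock training set in the script.

```python
def check_train_data(data):
    """Return a list of problems found in (text, annotations) training pairs.

    Hypothetical helper for illustration only; it is not part of spaCy.
    """
    problems = []
    for i, item in enumerate(data):
        if not (isinstance(item, (list, tuple)) and len(item) == 2):
            problems.append(f"item {i}: expected a (text, annotations) pair")
            continue
        text, annotations = item
        if not isinstance(text, str) or not isinstance(annotations, dict):
            problems.append(f"item {i}: expected (str, dict)")
            continue
        entities = annotations.get("entities")
        if not isinstance(entities, (list, tuple)):
            problems.append(f"item {i}: 'entities' must be a list or tuple")
            continue
        for ent in entities:
            # Each entity should be (start_char, end_char, label) inside the text.
            ok = (
                isinstance(ent, (list, tuple))
                and len(ent) == 3
                and isinstance(ent[0], int)
                and isinstance(ent[1], int)
                and isinstance(ent[2], str)
                and 0 <= ent[0] < ent[1] <= len(text)
            )
            if not ok:
                problems.append(f"item {i}: bad entity span {ent!r}")
    return problems


TRAIN_DATA = [
    ("Dog is an animal", {"entities": [(0, 3, "ANIMAL")]}),
    ("Cat is on the table", {"entities": [(0, 3, "ANIMAL")]}),
    ("Rats are pets", {"entities": [(0, 4, "ANIMAL")]}),
]
print(check_train_data(TRAIN_DATA))  # []
```

The mock set passes this check, so the data itself appears to be well-formed. That suggests the problem is in how the annotations are handed to spaCy, not in the training set.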

import os
import spacy
from spacy.lang.en import English
from spacy.training.example import Example
import random


def train_spacy(data, iterations = 30):
    TRAIN_DATA = data

    nlp = spacy.blank("en") #start with a blank model

    if "ner" not in nlp.pipe_names:
        ner = nlp.add_pipe("ner", last = True)

    for _, annotations in TRAIN_DATA:
        for ent in annotations.get("entities"):
            ner.add_label(ent[2])

    other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "ner"]
    
    with nlp.disable_pipes(*other_pipes):
        optimizer = nlp.begin_training()
        for itn in range(iterations):
            print("Starting iteration " + str(itn))
            random.shuffle(TRAIN_DATA)
            losses = {}

            for text, annotations in TRAIN_DATA:
                doc = nlp.make_doc(text)

                print(isinstance(annotations["entities"], (list,tuple))) #this prints True

                example = Example.from_dict(doc, {"entities":annotations})
                nlp.update(
                    [example],
                    drop = 0.2,
                    sgd = optimizer,
                    losses = losses
                )
        print(losses)
    return (nlp)

if __name__ == "__main__":
    #mock training set
    TRAIN_DATA=[('Dog is an animal',{'entities':[(0,3,'ANIMAL')]}),
           ('Cat is on the table',{'entities':[(0,3,'ANIMAL')]}),
           ('Rats are pets',{'entities':[(0,4,'ANIMAL')]})]
    nlp = train_spacy(TRAIN_DATA)

The error message

  File "c:\...\summarizer\src\feature_extraction\feature_extraction.py", line 49, in <module>
    nlp = train_spacy(TRAIN_DATA)
  File "c:\...\summarizer\src\feature_extraction\feature_extraction.py", line 35, in train_spacy
    example = Example.from_dict(doc, {"entities":annotations})
  File "spacy\training\example.pyx", line 118, in spacy.training.example.Example.from_dict
  File "spacy\training\example.pyx", line 24, in spacy.training.example.annotations_to_doc
  File "spacy\training\example.pyx", line 388, in spacy.training.example._add_entities_to_doc
ValueError: [E973] Unexpected type for NER data
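
The traceback ends in _add_entities_to_doc, which suggests that the value stored under "entities" in the dict passed to Example.from_dict is not a list or tuple. A minimal pure-Python sketch (no spaCy required) of what the training loop builds for one item of the mock set shows the likely cause. The names wrapped and unwrapped are illustrative only, and the suggested call in the comments is an assumption about what the script intended.

```python
# One item from the mock training set in the question.
text, annotations = ("Dog is an animal", {"entities": [(0, 3, "ANIMAL")]})

# What the training loop builds and hands to Example.from_dict.
wrapped = {"entities": annotations}

# The debug print in the loop inspects this value, which is a list...
print(type(annotations["entities"]).__name__)  # list

# ...but in the dict that is actually passed, "entities" maps to a dict:
# {"entities": {"entities": [(0, 3, "ANIMAL")]}}
print(type(wrapped["entities"]).__name__)  # dict

# Assumption: passing the annotations dict unchanged is what was intended,
# i.e. Example.from_dict(doc, annotations) in the training loop.
unwrapped = annotations
print(isinstance(unwrapped["entities"], (list, tuple)))  # True
```

In other words, the isinstance check in the loop prints True because it looks at annotations["entities"], while the call on the next line wraps annotations in a second "entities" key. Passing annotations directly would avoid the extra nesting.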
