Reputation: 107
spaCy 3.0 seems to have changed how nlp.update works, and I am confused by this simple code:
examples = TRAIN_DATA
random.shuffle(examples)
losses = {}
for batch in minibatch(examples, size=8):
    nlp.update(batch, sgd=optimizer, drop=0.35, losses=losses)
When I check type(batch), it reports that batch is a list, but the error message says it is a tuple. I also tried converting it to a list, without success. What am I doing wrong?
The exact error is:
TypeError Traceback (most recent call last) in
     22
     23 for batch in minibatch(examples, size=8):
---> 24     nlp.update(batch, sgd=optimizer, drop=0.35, losses=losses)
     25
     26 print("Losses ({}/{})".format(epoch + 1, epochs), losses)

~/nlp_learn/statbot/.statbot/lib/python3.8/site-packages/spacy/language.py in update(self, examples, _, drop, sgd, losses, component_cfg, exclude)
   1090         if len(examples) == 0:
   1091             return losses
-> 1092         validate_examples(examples, "Language.update")
   1093         examples = _copy_examples(examples)
   1094         if sgd is None:

~/nlp_learn/statbot/.statbot/lib/python3.8/site-packages/spacy/training/example.pyx in spacy.training.example.validate_examples()

TypeError: [E978] The Language.update method takes a list of Example objects, but got: {<class 'tuple'>}
Here is the first entry of TRAIN_DATA as an example: ('Auf Bauer Lehmanns Hof wird an beiden Pfingsttagen Brot im Backofen gebacken.', {'entities': [(10, 18, 'PER')]})
Upvotes: 4
Views: 3851
Reputation: 1782
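The batch itself is a list, but each item in it is still a raw (text, annotations) tuple, and that is what the error message reports. In spaCy 3, nlp.update expects a list of Example objects, so you need to convert each entry of TRAIN_DATA to an Example. Probably the easiest way is the Example.from_dict() method.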
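The list-versus-tuple mismatch can be reproduced without spaCy. This is only a minimal sketch: the plain slice is an assumed stand-in for one batch produced by minibatch, and TRAIN_DATA holds just the single entry quoted in the question.

```python
# One entry taken from the question; real training data would have many more.
TRAIN_DATA = [
    (
        "Auf Bauer Lehmanns Hof wird an beiden Pfingsttagen Brot im Backofen gebacken.",
        {"entities": [(10, 18, "PER")]},
    ),
]

# Assumption: a list slice stands in for one batch yielded by minibatch().
batch = TRAIN_DATA[:8]

container_type = type(batch)                 # type of the batch itself
item_types = {type(item) for item in batch}  # types of the items inside it

print(container_type)  # <class 'list'>
print(item_types)      # {<class 'tuple'>}
```

The set printed on the last line has the same shape as the set in the E978 message, which suggests the error is describing the items of the batch rather than the batch itself. This is why each tuple has to be converted before calling nlp.update: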
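import random
from spacy.training import Example
from spacy.util import minibatch

TRAIN_DATA = [...]  # your list of (text, annotations) tuples
random.shuffle(TRAIN_DATA)
losses = {}
for batch in minibatch(TRAIN_DATA, size=8):
    for text, annotations in batch:
        # Build a Doc from the raw text, then pair it with its annotations.
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        nlp.update([example], drop=0.35, sgd=optimizer, losses=losses)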
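Note that the loop above calls nlp.update once per example, so the batch size of 8 has little effect. One possible variation is to convert everything to Example objects first and then update once per batch. The following is a minimal sketch, not a full training script: the blank German pipeline, the freshly added "ner" component, and the one-entry TRAIN_DATA are assumptions, since the original post does not show how nlp and optimizer were created.

```python
import random

import spacy
from spacy.training import Example
from spacy.util import minibatch

# Assumption: a blank German pipeline with a new NER component.
nlp = spacy.blank("de")
nlp.add_pipe("ner")

# Single entry from the question; real training needs far more data.
TRAIN_DATA = [
    (
        "Auf Bauer Lehmanns Hof wird an beiden Pfingsttagen Brot im Backofen gebacken.",
        {"entities": [(10, 18, "PER")]},
    ),
]

# Convert every (text, annotations) tuple to an Example up front.
examples = [
    Example.from_dict(nlp.make_doc(text), annotations)
    for text, annotations in TRAIN_DATA
]

# Initialize the pipeline; the NER labels are inferred from the examples.
optimizer = nlp.initialize(lambda: examples)

random.shuffle(examples)
losses = {}
for batch in minibatch(examples, size=8):
    # Each batch is now a list of Example objects, which nlp.update accepts.
    nlp.update(batch, drop=0.35, sgd=optimizer, losses=losses)

print(losses)
```

If you already have a pipeline and optimizer, only the conversion step and the batch-level nlp.update call are needed.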
Upvotes: 6