Reputation: 206
spaCy's 'train' command takes a command-line option, --gpu 0, allowing a 'last minute' choice between training with the GPU and training without it, using the CPU only.
However, using the quickstart (https://spacy.io/usage/training#quickstart) to choose between GPU and CPU results in a major difference in the (base) configuration. In my case (dealing with NER), I get two different pipelines: ["tok2vec","ner"] when targeting the CPU and ["transformer","ner"] when targeting the GPU, each with a very different component setup in the rest of the config.
Since my GPU has only 6 GB of memory, I run out of GPU memory fairly fast, so I can't use it. But when I switch to using the CPU only, the training behavior of the two pipelines is vastly different:
The ["tok2vec","ner"] pipeline runs pretty much on a single core and trains my model (8,000 training docs, 2,000 dev/validation docs) in a couple of hours, notably faster than spaCy 2 (even with a GPU), though at times it uses a lot of memory (up to 30 GB).
The ["transformer","ner"] pipeline explodes into using up to 20 cores (on a machine with 40 logical cores), so I would expect it to run fast, but it appears to run forever. In an hour only the first 'epoch' completes, and then (on the next epoch) it crashes (see the traceback below). Since my data (DocBin files batching 100 'documents' each) is the same in both cases, the crash (an out-of-sequence B/I tag) is hard to explain; a quick sanity check over the DocBin files is sketched after the traceback.
My main question is: WHY is the pipeline different when targeting the GPU vs. the CPU? And where are the word vectors when targeting the GPU?
Crash: ...
File "C:\Work\ML\Spacy3\lib\site-packages\spacy\training\loop.py", line 98, in train
for batch, info, is_best_checkpoint in training_step_iterator:
File "C:\Work\ML\Spacy3\lib\site-packages\spacy\training\loop.py", line 194, in train_while_improving
nlp.update(
File "C:\Work\ML\Spacy3\lib\site-packages\spacy\language.py", line 1107, in update
proc.update(examples, sgd=None, losses=losses, **component_cfg[name])
File "spacy\pipeline\transition_parser.pyx", line 350, in spacy.pipeline.transition_parser.Parser.update
File "spacy\pipeline\transition_parser.pyx", line 604, in spacy.pipeline.transition_parser.Parser._init_gold_batch
File "spacy\pipeline\_parser_internals\ner.pyx", line 273, in spacy.pipeline._parser_internals.ner.BiluoPushDown.init_gold
File "spacy\pipeline\_parser_internals\ner.pyx", line 53, in spacy.pipeline._parser_internals.ner.BiluoGold.__init__
File "spacy\pipeline\_parser_internals\ner.pyx", line 69, in spacy.pipeline._parser_internals.ner.create_gold_state
File "spacy\training\example.pyx", line 240, in spacy.training.example.Example.get_aligned_ner
File "spacy\tokens\doc.pyx", line 698, in spacy.tokens.doc.Doc.ents.__get__
ValueError: [E093] token.ent_iob values make invalid sequence: I without B
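For reference, here is a minimal sketch of the sanity check mentioned above (the file name is illustrative; my real data is split across many DocBin files). Accessing doc.ents forces spaCy to validate the stored IOB sequence, so an 'I without B' annotation in the raw data should raise E093 right here, although the error could also be introduced later by tokenization alignment during training:

import spacy
from spacy.tokens import DocBin

# Blank pipeline just to provide a vocab for deserialization.
nlp = spacy.blank("en")

# Illustrative file name for one of the DocBin batches.
doc_bin = DocBin().from_disk("train_batch_000.spacy")

for i, doc in enumerate(doc_bin.get_docs(nlp.vocab)):
    try:
        # Touching doc.ents validates the ENT_IOB/ENT_TYPE annotations.
        _ = doc.ents
    except ValueError as err:
        print(f"doc {i}: {err}")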
Upvotes: 1
Views: 3597
Reputation: 15623
Basically, if you choose "GPU" in the quickstart, spaCy uses the Transformers pipeline, which is architecturally quite different from the CPU pipeline. The quickstart settings are the recommended base settings; the settings spaCy can actually use are much broader (and the --gpu flag at training time is one of them).
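You can see the difference directly in the generated configs. A minimal sketch, assuming the two quickstart configs were saved as config_cpu.cfg and config_gpu.cfg (illustrative file names):

import spacy

# Load the two base configs generated by the quickstart widget.
cpu_cfg = spacy.util.load_config("config_cpu.cfg")
gpu_cfg = spacy.util.load_config("config_gpu.cfg")

# The CPU config embeds with a shared tok2vec layer, the GPU config with a transformer.
print(cpu_cfg["nlp"]["pipeline"])  # e.g. ['tok2vec', 'ner']
print(gpu_cfg["nlp"]["pipeline"])  # e.g. ['transformer', 'ner']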
Transformers use attention to generate contextual embeddings, so there's no real concept of a single embedding for a word. These contextual embeddings are typically better than word embeddings. The spaCy Transformers models don't include word embeddings for this reason. The downside to Transformers is that they require pretty powerful hardware, including a GPU, to run. If you do have a powerful GPU, it usually makes sense to use Transformers.
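That also answers the "where are the vectors" part: the transformer pipeline ships no static word-vector table, and its token representations are computed on the fly and stored per document. A minimal sketch, assuming the pretrained en_core_web_lg and en_core_web_trf packages are installed:

import spacy

nlp_lg = spacy.load("en_core_web_lg")    # CPU pipeline with static vectors
nlp_trf = spacy.load("en_core_web_trf")  # transformer pipeline

# The lg model has a static vector table; the trf model's table is empty.
print(nlp_lg.vocab.vectors.shape)   # non-zero, e.g. (n_keys, 300)
print(nlp_trf.vocab.vectors.shape)  # empty - no static word vectors

# The transformer's contextual output is attached to each processed Doc
# by spacy-transformers instead.
doc = nlp_trf("The embeddings are computed per context, not per word.")
print(type(doc._.trf_data))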
The models used by the CPU pipeline don't require specialized hardware and are in general much faster to run, while still providing sufficient accuracy for many applications. If you don't have a GPU, they're also basically your only option. If you do have a GPU, you can use it to train non-Transformers pipelines, and it may provide a speedup, but the benefits are typically not dramatic. So spaCy supports training non-Transformers models on GPU, but if you have a GPU it's usually better to use Transformers.
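If you do want to flip the choice at the last minute for a non-transformer pipeline, the programmatic counterpart of the training flag is to request the GPU before loading anything. A minimal sketch (spacy.prefer_gpu falls back to the CPU if no usable GPU is found):

import spacy

# Activate the GPU if available; returns False and stays on CPU otherwise.
on_gpu = spacy.prefer_gpu()
print("Using GPU" if on_gpu else "Using CPU")

# The non-transformer pipeline runs on either device.
nlp = spacy.load("en_core_web_sm")
doc = nlp("Device selection happens before the pipeline is loaded.")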
Upvotes: 5