wanncy

Reputation: 77

How can I train a spaCy entity linking model on a GPU?

I trained a spaCy entity linking model by following the wiki_entity_linking example and found that the model was trained on the CPU. Each epoch takes a very long time: about 3 days for 2 epochs on a machine with 16 CPU cores and 64 GB of memory.

The command is: python wikidata_train_entity_linker.py -t 50000 -d 10000 -o xxx. How can I use a GPU for the training phase?

Upvotes: 1

Views: 2450

Answers (1)

swartchris8

Reputation: 720

You will need to refactor the code to call spacy.require_gpu() before initialising your NLP models. For more information, refer to the docs: https://spacy.io/api/top-level#spacy.require_gpu
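As a minimal sketch of where that call could go (not the actual code of wikidata_train_entity_linker.py): the helper name activate_gpu and its strict flag are hypothetical, while spacy.require_gpu() and spacy.prefer_gpu() are the documented top-level functions. It assumes spaCy was installed with GPU support (CuPy).

```python
def activate_gpu(strict: bool = False) -> bool:
    """Ask spaCy to allocate on the GPU. Call before creating or loading any pipeline.

    `activate_gpu` and `strict` are illustrative names, not part of spaCy.
    """
    # Imported lazily so this helper can be defined even where spaCy is absent.
    import spacy

    if strict:
        # require_gpu() raises an error when no GPU can be used.
        spacy.require_gpu()
        return True
    # prefer_gpu() returns True if the GPU was activated, False if spaCy stays on CPU.
    return bool(spacy.prefer_gpu())
```

In the training script, you would call activate_gpu() once near the top, before any spacy.load(...) or spacy.blank(...) call, and check the returned value to confirm the GPU was actually picked up.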

Before doing this, I would make sure your task is running on all cores. If it is not, you could use joblib to process minibatch partitions of your job in parallel:

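    from functools import partial
    from joblib import Parallel, delayed
    from spacy.util import minibatch

    # transform_texts is your own worker function; nlp, texts, batch_size, n_jobs and output_dir come from your script
    partitions = minibatch(texts, size=batch_size)
    executor = Parallel(n_jobs=n_jobs, backend="multiprocessing", prefer="processes")
    do = delayed(partial(transform_texts, nlp))
    tasks = (do(i, batch, output_dir) for i, batch in enumerate(partitions))
    executor(tasks)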

For more information, here's the joblib multiprocessing example from the docs: https://spacy.io/usage/examples#multi-processing
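To make the partitioning idea concrete, here is a rough standard-library sketch of what splitting a job into minibatches and choosing a worker count might look like. The names batches and n_jobs are illustrative only; in the snippet above, spacy.util.minibatch plays the role of batches.

```python
import os
from itertools import islice


def batches(items, size):
    """Yield consecutive lists of at most `size` items (illustrative helper)."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# One worker per available core; fall back to 1 if the count cannot be determined.
n_jobs = os.cpu_count() or 1

# Each (index, batch) pair is one unit of work that a worker process could take.
work_units = list(enumerate(batches(["text a", "text b", "text c"], size=2)))
```

Comparing n_jobs with the CPU usage you observe during training is a quick way to tell whether the job is actually using all cores.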

Upvotes: 2
