Shiva Sharma

Reputation: 155

How do I fix the ValueError raised by nlp.add_pipe(LanguageDetector(), name='language_detector', last=True) in spaCy 3?

Every time I run the following code, which I found on Kaggle, I get a ValueError. It seems to be caused by the changes in spaCy v3:

import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector

nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe(LanguageDetector(), name='language_detector', last=True)

ValueError: [E966] nlp.add_pipe now takes the string name of the registered component factory, not a callable component. Expected string, but got <spacy_langdetect.spacy_langdetect.LanguageDetector object at 0x00000216BB4C8D30> (name: 'language_detector').

I have installed these versions:

python_version: 3.8.5
spacy.version: '3.0.3'
scispacy.version: '0.4.0'
en_core_sci_lg.version: '0.4.0'

Upvotes: 9

Views: 14828

Answers (2)

Nicolas Mauti

Reputation: 546

You can also use the @Language.factory decorator to achieve the same result with less code:

import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector
from spacy.language import Language

@Language.factory('language_detector')
def language_detector(nlp, name):
    return LanguageDetector()

nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe('language_detector', last=True)

Upvotes: 11

polm23

Reputation: 15593

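The way add_pipe works changed in v3: components have to be registered first, and can then be added to a pipeline using just their name. Passing a component instance, as the v2-style code does, is what triggers E966. In this case you have to wrap the LanguageDetector in a factory function like so: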

import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector

from spacy.language import Language

def create_lang_detector(nlp, name):
    return LanguageDetector()

Language.factory("language_detector", func=create_lang_detector)

nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe('language_detector', last=True)

You can read more about how this works in the spaCy docs.
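
To make the register-then-add-by-name idea concrete, here is a rough, standard-library-only sketch of the pattern. It is not spaCy's actual implementation: ToyLanguage and ToyDetector are hypothetical stand-ins I made up for illustration, and the error messages are only loosely modelled on E966. It just shows why add_pipe can build a component from a string name once a factory is registered, and why both the decorator form and the func= form end up in the same place.

```python
from typing import Callable, Dict, List, Tuple


class ToyLanguage:
    """Hypothetical, heavily simplified stand-in for a pipeline class (not spaCy code)."""

    # Class-level registry shared by all pipelines: factory name -> factory function.
    factories: Dict[str, Callable] = {}

    def __init__(self) -> None:
        self.pipeline: List[Tuple[str, Callable]] = []

    @classmethod
    def factory(cls, name: str, func: Callable = None):
        """Register a factory, either as a decorator or via the func= argument."""
        def register(factory_func: Callable) -> Callable:
            cls.factories[name] = factory_func
            return factory_func

        # Called as factory("x", func=f): register immediately.
        if func is not None:
            return register(func)
        # Called as @factory("x"): return the decorator.
        return register

    def add_pipe(self, factory_name: str, name: str = None):
        """Build a component from a registered factory name and append it."""
        if not isinstance(factory_name, str):
            # Mirrors the spirit of E966: instances are rejected, names are expected.
            raise ValueError(f"Expected string, but got {factory_name!r}")
        if factory_name not in self.factories:
            raise ValueError(f"No factory registered under {factory_name!r}")
        name = name or factory_name
        component = self.factories[factory_name](self, name)
        self.pipeline.append((name, component))
        return component

    @property
    def pipe_names(self) -> List[str]:
        return [pipe_name for pipe_name, _ in self.pipeline]


class ToyDetector:
    """Hypothetical component; a real one would annotate the doc."""

    def __call__(self, doc):
        return doc


# Decorator form, as in the other answer.
@ToyLanguage.factory("language_detector")
def create_detector(nlp, name):
    return ToyDetector()


# Explicit form, as in this answer; registers the same function under another name.
ToyLanguage.factory("language_detector_alt", func=create_detector)

nlp = ToyLanguage()
nlp.add_pipe("language_detector")
print(nlp.pipe_names)  # ['language_detector']
```

The point of the indirection is that a pipeline can be described purely by names, so it can be recreated later as long as the same factories have been registered beforehand.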

Upvotes: 6
