Reputation: 155
Every time I run the following code, which I found on Kaggle, I get a ValueError. This is caused by the new v3 release of spaCy:
import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector
nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe(LanguageDetector(), name='language_detector', last=True)
ValueError: [E966] nlp.add_pipe now takes the string name of the registered component factory, not a callable component. Expected string, but got <spacy_langdetect.spacy_langdetect.LanguageDetector object at 0x00000216BB4C8D30> (name: 'language_detector').
- If you created your component with nlp.create_pipe('name'): remove nlp.create_pipe and call nlp.add_pipe('name') instead.
- If you passed in a component like TextCategorizer(): call nlp.add_pipe with the string name instead, e.g. nlp.add_pipe('textcat').
- If you're using a custom component: Add the decorator @Language.component (for function components) or @Language.factory (for class components / factories) to your custom component and assign it a name, e.g. @Language.component('your_name'). You can then run nlp.add_pipe('your_name') to add it to the pipeline.
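For reference, the @Language.component route the error describes would look roughly like this for a plain function component (a sketch with a hypothetical component name, reusing the nlp object from the snippet above):
from spacy.language import Language

@Language.component("my_debug_component")  # hypothetical name, for illustration only
def my_debug_component(doc):
    # a function component receives the Doc and must return it
    print("Doc length:", len(doc))
    return doc

nlp.add_pipe("my_debug_component", last=True)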
I have installed these versions:
python_version : 3.8.5
spacy.version : '3.0.3'
scispacy.version : '0.4.0'
en_core_sci_lg.version : '0.4.0'
Upvotes: 9
Views: 14828
Reputation: 546
You can also use the @Language.factory decorator to achieve the same result with less code:
import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector
from spacy.language import Language
@Language.factory('language_detector')
def language_detector(nlp, name):
    return LanguageDetector()
nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe('language_detector', last=True)
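With the factory registered, the detector runs like any other pipe; spacy_langdetect exposes its result on the doc._.language extension (a minimal usage sketch):
doc = nlp("Gene expression profiles were compared across tissue samples.")
# doc._.language is a dict such as {'language': 'en', 'score': 0.99}
print(doc._.language)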
Upvotes: 11
Reputation: 15593
The way add_pipe works changed in v3: components have to be registered first, and can then be added to a pipeline using just their name. In this case you have to wrap the LanguageDetector like so:
import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector
from spacy.language import Language
def create_lang_detector(nlp, name):
    return LanguageDetector()
Language.factory("language_detector", func=create_lang_detector)
nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe('language_detector', last=True)
You can read more about how this works in the spaCy docs.
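To verify that the component was actually added, you can inspect the pipeline names (a quick check, assuming the setup above):
# 'language_detector' should appear at the end of the pipeline
print(nlp.pipe_names)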
Upvotes: 6