Reputation: 705
I'm trying to use the spacy_langdetect package and the only example code I can find is (https://spacy.io/universe/project/spacy-langdetect):
import spacy
from spacy_langdetect import LanguageDetector
nlp = spacy.load("en_core_web_sm")
nlp.add_pipe(LanguageDetector(), name='language_detector', last=True)
text = 'This is an english text.'
doc = nlp(text)
print(doc._.language)
It throws this error:
nlp.add_pipe now takes the string name of the registered component factory, not a callable component.
So I tried the following to add it to my nlp pipeline:
language_detector = LanguageDetector()
nlp.add_pipe("language_detector")
But this gives the error:
Can't find factory for 'language_detector' for language English (en). This usually happens when spaCy calls nlp.create_pipe with a custom component name that's not registered on the current language class. If you're using a Transformer, make sure to install 'spacy-transformers'. If you're using a custom component, make sure you've added the decorator @Language.component (for function components) or @Language.factory (for class components). Available factories: attribute_ruler, tok2vec, merge_noun_chunks, merge_entities, merge_subtokens, token_splitter, parser, beam_parser, entity_linker, ner, beam_ner, entity_ruler, lemmatizer, tagger, morphologizer, senter, sentencizer, textcat, textcat_multilabel, en.lemmatizer
I don't fully understand how to add it since it's not really a custom component.
Upvotes: 16
Views: 16916
Reputation: 417
Same as @Eric but registered with the factory decorator:
import spacy
from spacy.language import Language
from spacy_langdetect import LanguageDetector
@Language.factory("language_detector")
def get_lang_detector(nlp, name):
    return LanguageDetector()
nlp = spacy.load("en_core_web_sm")
nlp.add_pipe('language_detector', last=True)
print(nlp("This is an english text.")._.language)
Upvotes: 8
Reputation: 486
In spaCy v3.0, a component that is not built in, such as LanguageDetector, has to be wrapped in a factory function and registered under a name before you can add it to the nlp pipeline. In your example, you can do the following:
import spacy
from spacy.language import Language
from spacy_langdetect import LanguageDetector
def get_lang_detector(nlp, name):
    return LanguageDetector()
nlp = spacy.load("en_core_web_sm")
Language.factory("language_detector", func=get_lang_detector)
nlp.add_pipe('language_detector', last=True)
text = 'This is an english text.'
doc = nlp(text)
print(doc._.language)
For built-in components (i.e. tagger, parser, ner, etc.), see: https://spacy.io/usage/processing-pipelines
Upvotes: 37