Bart Boerman

Reputation: 51

spacy: add lemmatizer lookup for Dutch (nl) language

I am using spaCy 2.0.11 with the Dutch language model nl_core_news_sm (nl). How can I add a lemmatization lookup similar to the implementation for German (de)?

I tried the following steps:

This resulted in the following error after 'nlp = nl_core_news_sm.load()' or 'from spacy.lang.nl import Dutch':

ModuleNotFoundError: No module named 'spacy.lang.nl.lemmatizer'
ImportError: [E048] Can't import language nl from spacy.lang

Upvotes: 2

Views: 2074

Answers (2)

Ines Montani

Reputation: 7105

In theory, your approach is correct – if you copy exactly how it's implemented in German and other languages that implement the lookup, it should work.
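To make the idea concrete, here is a minimal, self-contained sketch of what a lookup-based lemmatizer does. It does not use spaCy's API; the `LOOKUP` entries and the `lookup_lemma` helper are hypothetical and only illustrate the principle of mapping an inflected form to its lemma, with the word itself as the fallback.

```python
# Hypothetical lookup table: inflected form -> lemma.
# These few Dutch entries are illustrative only, not spaCy's actual data.
LOOKUP = {
    "huizen": "huis",
    "boeken": "boek",
    "liep": "lopen",
}


def lookup_lemma(word, table=LOOKUP):
    """Return the lemma for `word`, or the word unchanged if it is not in the table."""
    return table.get(word.lower(), word)


print([lookup_lemma(w) for w in ["Huizen", "liep", "fiets"]])
# → ['huis', 'lopen', 'fiets']
```

A real lookup table would hold many thousands of entries, but the mechanism is the same dictionary access.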

I suspect your problem here is actually a different one: according to the error message, Python can't find the spacy.lang.nl.lemmatizer module, so spaCy fails to import the Dutch language class. Are you sure the lemmatizer.py file exists in the correct place and is imported correctly? (If you're not doing it already, I'd also recommend running your development installation in a separate environment and building spaCy from source, to make sure there are no weird conflicts.)
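One quick way to check whether the module is visible to the interpreter you are actually running is sketched below. It uses only the standard library; the `module_exists` helper is a hypothetical name, and the module paths are the ones from the error message and from the German implementation mentioned in the question.

```python
import importlib.util


def module_exists(dotted_name):
    """Return True if `dotted_name` can be located on the current import path.

    find_spec imports the parent packages but not the target module itself.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import,
        # for example if spaCy is not installed in this environment.
        return False


for name in ("spacy.lang.de.lemmatizer", "spacy.lang.nl.lemmatizer"):
    print(name, "found" if module_exists(name) else "missing")
```

If the German module is reported as found and the Dutch one as missing, the new file is probably not in the installation that Python is importing from.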

Upvotes: 1

Jacopofar

Reputation: 3507

I'm afraid that is not possible: the English model includes a lemmatizer (see here) and the Dutch one does not (here).

The lemmatizer is a hand-written component based on the morphology of the language, so while spaCy has models for Dutch, this specific function is not available.

Upvotes: 0
