GitHunter0

Reputation: 574

spaCy lemmatization (via .lemma_) is returning only empty strings

I cannot get spaCy lemmatization to work; it always returns empty strings.

import spacy
from spacy.lang.en import English

nlp = English()
text = "I went to the bank today for checking my bank balance."
doc = nlp(text)

The following loop prints only empty strings:

for token in doc:
    print(token.lemma_)

System info:

Windows 10 Pro 64-bit
Python 3.8.8
spacy                         3.0.6
spacy-legacy                  3.0.5

Am I doing something wrong? I appreciate any input.

Upvotes: 1

Views: 1866

Answers (1)

polm23

Reputation: 15593

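The lemma data is fairly large, so it is not included in the core spaCy install, and a pipeline created with English() contains only a tokenizer, with no lemmatizer component. You need to install either an English model or the lookups data (the spacy-lookups-data package). You can download the small model like this: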

python -m spacy download en_core_web_sm

Then load the model.

import spacy
nlp = spacy.load("en_core_web_sm")

doc = nlp("cheeses")
print(doc[0].lemma_) # "cheese"

That should do it.

Upvotes: 2
