Reputation: 509
I am trying to extract all the noun phrases from French sentences using spaCy. My code does not work well in all the cases I tried. For example,
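import spacy

# Load the small French pipeline and parse one sentence
nlp = spacy.load("fr_core_news_sm")
doc = nlp("Il y a plusieurs petits restaurants dans cette ville.")
# Print each noun chunk found by the parser
for chunk in doc.noun_chunks:
    print(chunk)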
returns
[Il y a plusieurs petits restaurants dans cette ville.]
as the noun phrase. This appears to be incorrect, as the noun phrase here is petits restaurants dans cette ville.
When I tried other sentences, such as J'ai trouvé une jolie petite chambre.
it returned 3 phrases, [J', une jolie, petite chambre]
which also seems incorrect.
Lastly, with Les deux dernières semaines, il était à Paris.
it returned [Les deux dernières semaines, il]
which appears to be correct.
I would appreciate any help or guidance on how to make the code work correctly for the first two examples as well.
Upvotes: 0
Views: 447
Reputation: 2126
First, try updating your version of spaCy:
pip install spacy --upgrade
Then change your model from the small fr_core_news_sm
to a larger one such as fr_core_news_lg
To install:
python -m spacy download fr_core_news_lg
or install it directly with pip from spaCy's model repository, picking a release that matches your installed spaCy version, e.g.
pip install https://github.com/explosion/spacy-models/releases/download/fr_core_news_lg-2.3.0/fr_core_news_lg-2.3.0.tar.gz
Larger models typically have better accuracy on most NLP tasks.
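Putting the steps together, here is a minimal sketch that loads the largest French model you have installed and returns the noun chunks. The helper names first_available and french_noun_chunks are my own, not part of spaCy, and I am assuming the two model names from this thread are the only candidates. Switching models may improve the parse, but it is not guaranteed to fix every sentence.

```python
from typing import Callable, Iterable, List, Optional

# Candidate models, largest first (both names are taken from this thread).
PREFERRED_MODELS = ("fr_core_news_lg", "fr_core_news_sm")


def first_available(
    candidates: Iterable[str], is_installed: Callable[[str], bool]
) -> Optional[str]:
    """Return the first candidate for which is_installed() is true, else None."""
    for name in candidates:
        if is_installed(name):
            return name
    return None


def french_noun_chunks(text: str) -> List[str]:
    """Parse text with the best installed French model and return its noun chunks."""
    # Imported lazily so first_available() can be used without spaCy installed.
    import spacy
    from spacy.util import is_package

    model_name = first_available(PREFERRED_MODELS, is_package)
    if model_name is None:
        raise RuntimeError(
            "No French model installed; run: "
            "python -m spacy download fr_core_news_lg"
        )
    nlp = spacy.load(model_name)
    return [chunk.text for chunk in nlp(text).noun_chunks]
```

Calling french_noun_chunks("J'ai trouvé une jolie petite chambre.") with each model installed in turn lets you compare the chunks they produce on the sentences from the question.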
Upvotes: 1