Reputation: 115
I'm trying to load the en_core_web_sm spaCy model, but I have been unsuccessful in doing so.
The error that occurs is the following:
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory.
I'm working in an Anaconda virtual environment. The following checkboxes are ticked:
- conda activate gcp-env prior to installing spaCy and the English language model
- conda install -c conda-forge spacy while on the right environment
- python -m spacy download en, still while on the right environment
- Added spacy to the requirements.txt and installed dependencies via that route, after the first attempts failed
spacy info produces this output:
spacy info
============================== Info about spaCy ==============================
spaCy version 3.3.0
Location /Users/simonmortensen/opt/anaconda3/envs/gcp-env/lib/python3.10/site-packages/spacy
Platform macOS-11.6.5-x86_64-i386-64bit
Python version 3.10.4
Pipelines en_core_web_sm (3.3.0)
python -m spacy validate produces this output:
================= Installed pipeline packages (spaCy v3.3.0) =================
ℹ spaCy installation:
/Users/simonmortensen/opt/anaconda3/envs/gcp-env/lib/python3.10/site-packages/spacy
NAME             SPACY                 VERSION
en_core_web_sm   >=3.3.0.dev0,<3.4.0   3.3.0     ✔
I've been through several previous StackOverflow posts on the same topic. Those have often been solved, but my issue remains.
Any advice would be very much appreciated. Thanks in advance!
Simon
EDIT:
For additional context, pip list on the environment contains both
spacy 3.3.0
spacy-legacy 3.0.9
spacy-loggers 1.0.2
and
en-core-web-sm 3.3.0
Even so, import en_core_web_sm also doesn't work:
import en_core_web_sm
Traceback (most recent call last):
Input In [65] in <cell line: 1>
import en_core_web_sm
ModuleNotFoundError: No module named 'en_core_web_sm'
Upvotes: 2
Views: 3259
Reputation: 149
spaCy models are essentially ordinary Python packages, so you can first check whether a model is installed and download it only if it is missing. The following function starts a download only when it is really necessary:
import spacy

def install_spacy_model(spacy_model_id: str = "en_core_web_sm", quiet: bool = False):
    # Initiate the download only if it is really necessary (i.e. if the language pack is not installed).
    if spacy.util.is_package(spacy_model_id):
        print(f"spaCy model: [{spacy_model_id}] is already installed and will therefore not be downloaded again.")
    else:
        print(f"spaCy model: [{spacy_model_id}] was not found. Download initiated...")
        # Suppress pip installation messages on request.
        if quiet:
            spacy.cli.download(spacy_model_id, False, False, "--quiet")
        else:
            spacy.cli.download(spacy_model_id, False, False)
Then call it:
install_spacy_model()
Result:
>> spaCy model: [en_core_web_sm] was not found. Download initiated...
>> ✔ Download and installation successful
>> You can now load the package via spacy.load('en_core_web_sm')
Upvotes: 0
Reputation: 115
Spyder was the villain.
All packages were correctly installed in the virtual environment, but Spyder was not running that environment (even though the IDE was launched with the spyder command from a terminal where the environment was in fact activated).
To make Spyder run the correct environment, you need to change the Python interpreter in the Spyder preferences and then restart the kernel.
I got an error prompting me to pip install spyder-kernels==2.1.*, but once that was done (make sure to do it in the right venv), I restarted Spyder, and it finally worked!
See discussions in thread: https://github.com/explosion/spaCy/discussions/10895.
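A quick way to confirm which interpreter the IDE console is actually using is to ask Python itself. The sketch below is only an illustration: the function name kernel_report is made up for this example, and it uses nothing beyond the standard library. Run it inside the Spyder console and compare the paths with the Location line from spacy info.

```python
import importlib.util
import sys


def kernel_report(package: str = "en_core_web_sm") -> dict:
    """Describe the running interpreter and whether it can see `package`.

    `kernel_report` is a hypothetical helper name, not part of spaCy or Spyder.
    """
    # find_spec returns None when a top-level module cannot be located
    # on this interpreter's import path.
    spec = importlib.util.find_spec(package)
    return {
        "executable": sys.executable,  # path of the Python binary running this code
        "prefix": sys.prefix,          # root directory of the active environment
        "package_found": spec is not None,
    }


if __name__ == "__main__":
    for key, value in kernel_report().items():
        print(f"{key}: {value}")
```

If prefix does not point at the gcp-env directory, the console is running a different Python than the one the model was installed into, which would explain the ModuleNotFoundError.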
Upvotes: 2
Reputation: 1
I think it is not picking up the correct environment path.
In a terminal, execute which python and check that the result points into the environment's directory.
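The same check can be scripted. This is a minimal sketch, assuming a conda environment: python_on_path_matches_env is a hypothetical helper, and it relies on the CONDA_PREFIX variable that conda sets when an environment is activated. It looks up the first python on PATH (what which python would print) and tests whether that file sits under the environment directory.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def python_on_path_matches_env(env_prefix: Optional[str] = None) -> bool:
    """Return True if the first `python` on PATH lives under `env_prefix`.

    Hypothetical helper for illustration. When `env_prefix` is omitted,
    the CONDA_PREFIX environment variable is used (assumes conda).
    """
    env_prefix = env_prefix or os.environ.get("CONDA_PREFIX")
    python_path = shutil.which("python")  # same lookup as `which python`
    if not env_prefix or not python_path:
        return False
    # Compare absolute paths without resolving symlinks, since the python
    # entry in an environment's bin directory is often a symlink.
    env_dir = Path(os.path.abspath(env_prefix))
    return env_dir in Path(os.path.abspath(python_path)).parents


if __name__ == "__main__":
    print("python on PATH:", shutil.which("python"))
    print("matches active environment:", python_on_path_matches_env())
```

A False result with the environment activated suggests that PATH is ordered so that another Python installation is found first.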
Upvotes: 0