Reputation: 33
I want to use the huggingface datasets library from within a Jupyter notebook.
This should be as simple as installing it (pip install datasets, in bash within a venv) and importing it (import datasets, in Python or a notebook).
Everything works when I test it in the standard Python interactive shell. In a Jupyter notebook, however, the import fails with:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-6-652e886d387f> in <module>
----> 1 import datasets
ModuleNotFoundError: No module named 'datasets'
At first, I thought it might be the case that the notebook kernel uses a different virtual environment, but I verified from within the notebook that the package is installed:
!pip install datasets
Requirement already satisfied: datasets in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (1.8.0)
Requirement already satisfied: numpy>=1.17 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (1.21.0)
Requirement already satisfied: xxhash in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (2.0.2)
Requirement already satisfied: pyarrow<4.0.0,>=1.0.0 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (3.0.0)
Requirement already satisfied: pandas in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (1.2.5)
Requirement already satisfied: fsspec in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (2021.6.1)
Requirement already satisfied: packaging in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (20.9)
Requirement already satisfied: dill in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (0.3.4)
Requirement already satisfied: requests>=2.19.0 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (2.25.1)
Requirement already satisfied: tqdm<4.50.0,>=4.27 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (4.49.0)
Requirement already satisfied: multiprocess in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (0.70.12.2)
Requirement already satisfied: huggingface-hub<0.1.0 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from datasets) (0.0.13)
Requirement already satisfied: pytz>=2017.3 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from pandas->datasets) (2021.1)
Requirement already satisfied: python-dateutil>=2.7.3 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from pandas->datasets) (2.8.1)
Requirement already satisfied: pyparsing>=2.0.2 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from packaging->datasets) (2.4.7)
Requirement already satisfied: certifi>=2017.4.17 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from requests>=2.19.0->datasets) (2021.5.30)
Requirement already satisfied: chardet<5,>=3.0.2 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from requests>=2.19.0->datasets) (4.0.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from requests>=2.19.0->datasets) (1.26.6)
Requirement already satisfied: idna<3,>=2.5 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from requests>=2.19.0->datasets) (2.10)
Requirement already satisfied: typing-extensions in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from huggingface-hub<0.1.0->datasets) (3.10.0.0)
Requirement already satisfied: filelock in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from huggingface-hub<0.1.0->datasets) (3.0.12)
Requirement already satisfied: six>=1.5 in /home/yoga/venvs/text_embeddings/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas->datasets) (1.16.0)
and
!pip freeze
certifi==2021.5.30
chardet==4.0.0
datasets==1.8.0
dill==0.3.4
filelock==3.0.12
fsspec==2021.6.1
huggingface-hub==0.0.13
idna==2.10
multiprocess==0.70.12.2
numpy==1.21.0
packaging==20.9
pandas==1.2.5
pyarrow==3.0.0
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2021.1
requests==2.25.1
six==1.16.0
tqdm==4.49.0
typing-extensions==3.10.0.0
urllib3==1.26.6
xxhash==2.0.2
Any ideas? Do I need to configure the notebook in a special way, or is there a problem with the datasets module? Thanks!
Edit: Following the answer below, this makes the error go away:
datasets_dir=r"/home/yoga/venvs/text_embeddings/lib/python3.8/site-packages/datasets"
import sys
sys.path.append(datasets_dir)
import datasets
But is there a way that works without setting this path explicitly? (Or can somebody explain why this is necessary here?)
Upvotes: 2
Views: 11484
Reputation: 11
I installed datasets via Jupyter Notebook with the command: !pip install datasets
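One caveat, as far as I understand it: !pip runs whichever pip is first on the shell's PATH, which is not necessarily the one belonging to the interpreter the kernel runs on. A minimal sketch of installing through the kernel's own interpreter is below. The helper names pip_install_command and install_for_this_kernel are made up for illustration; only sys.executable, subprocess.run and python -m pip are standard.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this kernel.

    Hypothetical helper: sys.executable is the Python binary behind the
    current notebook kernel, so `-m pip` installs into that environment.
    """
    return [sys.executable, "-m", "pip", "install", package]


def install_for_this_kernel(package):
    """Run the install and return pip's exit code (0 means success)."""
    completed = subprocess.run(pip_install_command(package), check=False)
    return completed.returncode


# Only show the command here; call install_for_this_kernel("datasets")
# in a notebook cell when you actually want to install.
print(pip_install_command("datasets"))
```

Recent IPython versions also provide the %pip magic (%pip install datasets), which is meant to do the same thing. The kernel may need a restart before the import succeeds.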
Upvotes: 0
Reputation: 548
I faced a similar problem with another library, and this worked for me:
import sys
sys.path.append(r"path to datasets in python env")
import dataset_utils
Path in your case -> "/home/yoga/venvs/text_embeddings/lib/python3.8/site-packages/datasets"
My guess is that the PYTHONPATH environment variable is not set up correctly. PYTHONPATH is an environment variable whose content is added to sys.path, which is where Python looks for modules. You can set it to whatever you like.
This should work!!
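To check whether the guess about sys.path holds in a given notebook, something like the sketch below can be run in a cell. It is only a diagnostic, not a fix. The function name diagnose_import is hypothetical; it relies on the standard importlib.util.find_spec, sys.executable, sys.prefix and sys.path, and assumes a top-level module name such as datasets.

```python
import importlib.util
import os
import sys


def diagnose_import(module_name):
    """Collect facts that help explain why `import module_name` may fail.

    Hypothetical helper; pass a top-level module name (no dots).
    """
    # find_spec searches sys.path without importing the module itself.
    spec = importlib.util.find_spec(module_name)
    return {
        # The Python binary the kernel is actually running on.
        "interpreter": sys.executable,
        # Root of the active environment (the venv directory, if any).
        "prefix": sys.prefix,
        # None when the variable is not set at all.
        "pythonpath": os.environ.get("PYTHONPATH"),
        # File the module would be loaded from, or None if not found.
        # (Also None for namespace packages, which have no single file.)
        "found_at": spec.origin if spec is not None else None,
        # Directories Python searches, in order.
        "search_path": list(sys.path),
    }


report = diagnose_import("datasets")
for key, value in report.items():
    print(f"{key}: {value}")
```

If interpreter or prefix does not point into the venv where pip reported the package (here /home/yoga/venvs/text_embeddings), the kernel is probably running on a different Python than the one pip installed into, and that would explain the error without PYTHONPATH being involved.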
Upvotes: 1