As the title says: I'm trying to import a module that uses transformers, but it throws the following error:
(tf2venv) dante@dante-Inspiron-5570:~/projects/classification$ inv process-pdf test.pdf
Using TensorFlow backend.
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/dante/projects/classification/venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Using TensorFlow backend.
Traceback (most recent call last):
File "/home/dante/projects/classification/venv/bin/inv", line 8, in <module>
sys.exit(program.run())
File "/home/dante/projects/classification/venv/lib/python3.7/site-packages/invoke/program.py", line 373, in run
self.parse_collection()
File "/home/dante/projects/classification/venv/lib/python3.7/site-packages/invoke/program.py", line 465, in parse_collection
self.load_collection()
File "/home/dante/projects/classification/venv/lib/python3.7/site-packages/invoke/program.py", line 696, in load_collection
module, parent = loader.load(coll_name)
File "/home/dante/projects/classification/venv/lib/python3.7/site-packages/invoke/loader.py", line 76, in load
module = imp.load_module(name, fd, path, desc)
File "/home/dante/.pyenv/versions/3.7.0/lib/python3.7/imp.py", line 235, in load_module
return load_source(name, filename, file)
File "/home/dante/.pyenv/versions/3.7.0/lib/python3.7/imp.py", line 172, in load_source
module = _load(spec)
File "<frozen importlib._bootstrap>", line 696, in _load
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/dante/projects/classification/tasks.py", line 10, in <module>
from app import ClassifyDocument
File "/home/dante/projects/classification/app.py", line 15, in <module>
from docType_classification import classify, grouping, utils
File "/home/dante/projects/classification/docType_classification/classify.py", line 13, in <module>
import common.hybrid as hybrid
File "/home/dante/projects/classification/common/hybrid.py", line 3, in <module>
import transformers
ModuleNotFoundError: No module named 'transformers'
The code in common/hybrid.py is as follows:
import transformers
from tokenizers import BertWordPieceTokenizer
from tqdm import tqdm  # import the callable, not the module, so tqdm(...) works
import numpy as np


def build_tokenizer():
    # Load the pretrained DistilBERT tokenizer
    tokenizer = transformers.DistilBertTokenizer.from_pretrained(
        "distilbert-base-uncased"
    )
    # Save the loaded tokenizer locally (writes vocab.txt to the current directory)
    tokenizer.save_pretrained(".")
    # Reload it with the Hugging Face tokenizers library
    hugging_face_tokenizer = BertWordPieceTokenizer("vocab.txt", lowercase=False)
    return hugging_face_tokenizer


def encode(texts, tokenizer, chunk_size=256, maxlen=512):
    # Truncate and pad every sequence to exactly maxlen tokens
    tokenizer.enable_truncation(max_length=maxlen)
    tokenizer.enable_padding(length=maxlen)
    all_ids = []
    print(len(texts))
    # Encode in chunks of chunk_size; texts must support slicing and .tolist()
    for i in tqdm(range(0, len(texts), chunk_size)):
        text_chunk = texts[i : i + chunk_size].tolist()
        encs = tokenizer.encode_batch(text_chunk)
        all_ids.extend([enc.ids for enc in encs])
    return np.array(all_ids)
It is imported in classify.py as:
import common.hybrid as hybrid
I'm able to compile and run this file with
python3 common/hybrid.py
without any errors.
When I run an invoke task with
invoke process-data
(tasks.py is located in the project root directory), I get the ModuleNotFoundError as soon as it reaches the transformers import.
Note that when I add
import tensorflow
above the transformers import, tensorflow is imported correctly and the error isn't thrown until
import transformers
is reached.
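To make the "tensorflow imports, transformers does not" observation easier to inspect, something like the following could be placed temporarily at the top of common/hybrid.py. It is only a rough sketch: `locate_modules` is a hypothetical helper name, not part of any library, and it relies on the standard library's `importlib.util.find_spec`, which looks a module up without importing it.

```python
import importlib.util


def locate_modules(names):
    """Map each top-level module name to the file it would load from, or None."""
    locations = {}
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec can raise when a module is in an unusual state.
            spec = None
        # Note: namespace packages may report an origin of None even when found.
        locations[name] = spec.origin if spec is not None else None
    return locations


if __name__ == "__main__":
    for name, origin in locate_modules(["tensorflow", "transformers"]).items():
        print(f"{name}: {origin or 'NOT FOUND'}")
```

Running this once with `python3 common/hybrid.py` and once through the invoke task should show whether both runs resolve the two packages from the same site-packages directory.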
pip freeze output:
absl-py==0.12.0
appdirs==1.4.4
astunparse==1.6.3
attrs==20.3.0
backcall==0.2.0
bearbones==2.300
black==20.8b1
boto3==1.9.85
botocore==1.12.253
cachetools==4.2.1
certifi==2020.12.5
cfgv==3.2.0
chardet==3.0.4
click==7.1.2
decorator==5.0.6
distlib==0.3.1
docutils==0.15.2
fancycompleter==0.9.1
filelock==3.0.12
flake8==3.9.0
fuzzysearch==0.7.3
gast==0.3.3
google-auth==1.28.1
google-auth-oauthlib==0.4.4
google-pasta==0.2.0
grpcio==1.37.0
h5py==2.10.0
identify==2.2.3
idna==2.8
invoke==1.5.0
ipython==7.14.0
ipython-genutils==0.2.0
isort==5.8.0
jedi==0.18.0
jmespath==0.10.0
joblib==1.0.1
Keras==2.3.1
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.2
lxml==4.6.3
Markdown==3.3.4
mccabe==0.6.1
more-itertools==8.7.0
mypy-extensions==0.4.3
nodeenv==1.6.0
numpy==1.20.2
oauthlib==3.1.0
opt-einsum==3.3.0
packaging==20.9
pandas==1.1.5
parso==0.8.2
pathspec==0.8.1
pdbpp==0.10.2
pdf2image==1.10.0
pdftotext==2.1.5
pexpect==4.8.0
pickleshare==0.7.5
pikepdf==1.7.1
Pillow==8.2.0
pipdeptree==2.0.0
pluggy==0.13.1
pre-commit==2.12.0
prompt-toolkit==3.0.18
protobuf==3.15.8
ptyprocess==0.7.0
py==1.10.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycodestyle==2.7.0
pyflakes==2.3.1
Pygments==2.8.1
pyparsing==2.4.7
PyPDF2==1.26.0
pyrepl==0.9.0
pytest==5.3.5
python-dateutil==2.8.1
pytz==2021.1
PyYAML==5.4.1
redis==3.3.11
regex==2021.4.4
requests==2.21.0
requests-oauthlib==1.3.0
rsa==4.7.2
s3transfer==0.1.13
sacremoses==0.0.44
scipy==1.4.1
sentencepiece==0.1.95
six==1.15.0
tenacity==6.0.0
tensorboard==2.2.2
tensorboard-plugin-wit==1.8.0
tensorflow==2.2.0
tensorflow-estimator==2.2.0
termcolor==1.1.0
tokenizers==0.10.2
toml==0.10.2
tqdm==4.60.0
traitlets==5.0.5
transformers==4.4.2
typed-ast==1.4.3
typing-extensions==3.7.4.3
urllib3==1.24.1
virtualenv==20.4.3
wcwidth==0.2.5
Werkzeug==1.0.1
wmctrl==0.3
wrapt==1.12.1
Other info:
(tf2venv) dante@dante-Inspiron-5570:~/projects/classification$ which python
/home/dante/projects/classification/tf2venv/bin/python
(tf2venv) dante@dante-Inspiron-5570:~/projects/classification$ which inv
/home/dante/projects/classification/tf2venv/bin/inv
(tf2venv) dante@dante-Inspiron-5570:~/projects/classification$ python3 --version
Python 3.8.0
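One thing that might be worth checking alongside the output above: the shell reports tf2venv and Python 3.8.0, while the paths in the traceback run through a directory named venv with python3.7. On POSIX systems, the command-line scripts that pip installs normally start with a `#!` line naming the interpreter they run under, so reading that line shows which Python `inv` actually uses. The snippet below is a minimal sketch; `read_shebang` is a hypothetical helper, and the lookup of `inv` via `shutil.which` assumes it is on the PATH.

```python
import shutil
from pathlib import Path


def read_shebang(script_path):
    """Return the interpreter named on a script's '#!' line, or None if absent."""
    with Path(script_path).open("rb") as handle:
        first_line = handle.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].strip()


if __name__ == "__main__":
    # Assumption: the entry point is called "inv", as in the shell session above.
    inv_path = shutil.which("inv")
    if inv_path is None:
        print("inv was not found on PATH")
    else:
        print(inv_path, "->", read_shebang(inv_path))
```

If the interpreter printed here differs from the one `which python` reports, the task is running in a different environment from the one where the packages were installed.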
Note that there are no circular imports, and I've tried various versions of transformers (v3 to v4).
Everything was installed with pip3, and the venv was created with
python3 -m venv tf2venv
I've deleted the venv and reinstalled everything several times, but nothing works. Is there something missing that is causing this ModuleNotFoundError with transformers?
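Since the venv has been recreated several times, it may help to record exactly which interpreter each entry point ends up using. The following is a small sketch using only the standard library; `environment_summary` is a hypothetical helper name. It relies on the fact that inside an environment created with `python3 -m venv`, `sys.prefix` points at the venv while `sys.base_prefix` points at the base installation.

```python
import sys


def environment_summary():
    """Describe the interpreter running this code and whether it is inside a venv."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "prefix": sys.prefix,
        # The two prefixes differ only when running inside a virtual environment.
        "in_venv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")
```

Calling it from the top of tasks.py and from a plain `python3` session lets the two summaries be compared side by side.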
My requirements.txt is:
bearbones>=2
fuzzysearch~=0.7.3
ipython~=7.14.0
Keras~=2.3.0
pdf2image~=1.10.0
pikepdf~=1.7.0
tenacity~=6.0.0
tensorflow==2.2.0
transformers==3.0.2
pandas~=1.1.5
pytest~=5.3.2
pdftotext~=2.1.4