Ziqi

Reputation: 2544

Model name 'bert-base-uncased' was not found in tokenizers

My code that loads a pre-trained BERT model had been working fine until today, when I moved it to a new server. I set up the environment the same way as before, but when loading the 'bert-base-uncased' model I got this error:

Traceback (most recent call last):
  File "/jmain02/home/J2AD003/txk64/zzz70-txk64/.conda/envs/tensorflow-gpu/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/jmain02/home/J2AD003/txk64/zzz70-txk64/.conda/envs/tensorflow-gpu/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/jmain02/home/J2AD003/txk64/zzz70-txk64/wop_bert/code/python/src/exp/run_exp_bert_apply.py", line 74, in <module>
    input_text_fields)
  File "/jmain02/home/J2AD003/txk64/zzz70-txk64/wop_bert/code/python/src/classifier/classifier_bert_.py", line 556, in fit_bert_trainonly
    tokenizer = BertTokenizer.from_pretrained(bert_model, do_lower_case=True)
  File "/jmain02/home/J2AD003/txk64/zzz70-txk64/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 1140, in from_pretrained
    return cls._from_pretrained(*inputs, **kwargs)
  File "/jmain02/home/J2AD003/txk64/zzz70-txk64/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 1246, in _from_pretrained
    list(cls.vocab_files_names.values()),
OSError: Model name 'bert-base-uncased' was not found in tokenizers model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, TurkuNLP/bert-base-finnish-cased-v1, TurkuNLP/bert-base-finnish-uncased-v1, wietsedv/bert-base-dutch-cased). We assumed 'bert-base-uncased' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.txt'] but couldn't find such vocabulary files at this path or url.

And the line that triggered this error (classifier_bert_.py line 556) is very simple:

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)

Can I please have some help on how to solve this issue?

Thanks

Upvotes: 0

Views: 5383

Answers (1)

Mohit Reddy

Reputation: 141

You have to download the model files and put them in a local directory, then point from_pretrained at that path.

You can download it from here: https://huggingface.co/bert-base-uncased
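As a minimal sketch, assuming you downloaded vocab.txt from that page into a local folder (the folder name ./bert-base-uncased below is just an example, use whatever path you saved the files to):

from transformers import BertTokenizer

# Load the tokenizer from the local folder containing vocab.txt
# instead of fetching 'bert-base-uncased' from the hub.
tokenizer = BertTokenizer.from_pretrained('./bert-base-uncased', do_lower_case=True)

If the folder also contains config.json and the model weights, you can pass the same path to BertModel.from_pretrained, so nothing needs to be downloaded at runtime on the new server.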

Upvotes: 1
