L Akshay

Reputation: 33

BERT tokenizer is not working despite importing all packages. Has the syntax changed?

I am trying to run the BERT tokenizer but keep getting errors. Can anyone point out where I am going wrong?

FullTokenizer = bert.bert_tokenization.FullTokenizer
bert_layer = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/1", trainable=False)
vocab_file = bert_layer.resolved_object.vocab_file.asset_path.numpy()
do_lower_case = bert_layer.resolved_object.do_lower_case.numpy()
tokenizer = FullTokenizer(vocab_file, do_lower_case)

Error:

AttributeError Traceback (most recent call last) in ()
----> 1 FullTokenizer = bert.bert_tokenization.FullTokenizer
      2 bert_layer = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/1",
      3                             trainable=False)
      4 vocab_file = bert_layer.resolved_object.vocab_file.asset_path.numpy()
      5 do_lower_case = bert_layer.resolved_object.do_lower_case.numpy()

AttributeError: module 'bert' has no attribute 'bert_tokenization'

For reference, these are the packages I installed and imported:

!pip install bert-for-tf2
!pip install sentencepiece
!pip install bert-tensorflow
!pip install tensorflow==2.0

try:
    %tensorflow_version 2.x
except Exception:
    pass
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers
import bert
from bert import tokenization

Upvotes: 2

Views: 3650

Answers (2)

Niranand Khedkar

Reputation: 1

!pip install bert-tensorflow
!pip install --upgrade bert
!pip install tokenization

from bert import tokenization
from bert.tokenization.bert_tokenization import FullTokenizer
tokenizer = FullTokenizer(vocab_file=vocab_file, do_lower_case=do_lower_case)

Upvotes: 0

Prithvi Shetty

Reputation: 61

I ran into a similar situation before.

Try looking for a folder named "bert" in the directory where your script/notebook is being run. Delete that folder or rename it to something other than "bert". It is very likely that when you import bert, Python picks up that folder instead of the bert-for-tf2 package you installed in the Python site-packages directory.
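
One way to check this before deleting anything is to ask Python where it would load bert from. The following is only a minimal sketch: locate_module and is_inside are hypothetical helper names introduced for illustration, and the check assumes the package is meant to be importable under the name bert.

```python
import importlib.util
import os


def locate_module(name):
    """Return the path Python would load the top-level module `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.origin:
        # Regular module or package: origin is usually the path of the .py file.
        return spec.origin
    # A folder without __init__.py is treated as a namespace package,
    # which typically has no origin, so fall back to its search location.
    locations = list(spec.submodule_search_locations or [])
    return locations[0] if locations else None


def is_inside(path, directory):
    """True if `path` lies inside `directory` (both are made absolute first)."""
    path = os.path.abspath(path)
    directory = os.path.abspath(directory)
    return os.path.commonpath([path, directory]) == directory


if __name__ == "__main__":
    found = locate_module("bert")
    if found is None:
        print("bert is not importable at all")
    elif is_inside(found, os.getcwd()):
        print("bert resolves to a local path that shadows the installed package:", found)
    else:
        print("bert resolves to:", found)
```

If the printed path points into your working directory rather than site-packages, the local folder is the one being imported.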

If that still doesn't work, try

from bert import tokenization
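
If the import path itself is the problem, a small helper can try several candidate locations for FullTokenizer and report which one worked. This is only a sketch: first_importable is a hypothetical helper, and the candidate paths are assumptions based on the module layouts I believe bert-for-tf2 and bert-tensorflow have used in different releases, so check them against the version you actually have installed. Note that both of those packages install a top-level package named bert, so having both installed may also affect which layout you see.

```python
import importlib


def first_importable(candidates):
    """Return (module_path, attribute) for the first candidate that resolves.

    `candidates` is a list of (module_path, attribute_name) pairs.
    Raises ImportError if none of them can be imported.
    """
    for module_path, attr_name in candidates:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            continue  # module (or one of its parents) is missing, try the next one
        if hasattr(module, attr_name):
            return module_path, getattr(module, attr_name)
    raise ImportError("none of the candidates could be imported: %r" % (candidates,))


# Assumed locations of FullTokenizer; adjust to match your installed package.
TOKENIZER_CANDIDATES = [
    ("bert.tokenization.bert_tokenization", "FullTokenizer"),
    ("bert.bert_tokenization", "FullTokenizer"),
    ("bert.tokenization", "FullTokenizer"),
]

if __name__ == "__main__":
    try:
        path, FullTokenizer = first_importable(TOKENIZER_CANDIDATES)
        print("FullTokenizer found in", path)
    except ImportError as exc:
        print(exc)
```

Once a path is reported, you can use that import directly in your notebook and drop the helper.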

Upvotes: 3
