Reputation: 25
I am trying to use XLNet through transformers. However, I keep getting the error "AttributeError: 'NoneType' object has no attribute 'tokenize'", and I am unsure how to proceed. If anyone could point me in the right direction, it would be appreciated.
from transformers import XLNetTokenizer
tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased', do_lower_case=True)
print(' Original: ', X_train[1])
# Print the tweet split into tokens.
print('Tokenized: ', tokenizer.tokenize(X_train[1]))
# Print the tweet mapped to token ids.
print('Token IDs: ', tokenizer.convert_tokens_to_ids(tokenizer.tokenize(X_train[1])))
Original: hey angel duh sexy really thanks haha
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-67-2b1b432b3e15> in <module>()
2
3 # Print the tweet split into tokens.
----> 4 print('Tokenized: ', tokenizer.tokenize(X_train[2]))
5
6 # Print the tweet mapped to token ids.
AttributeError: 'NoneType' object has no attribute 'tokenize'
Upvotes: 1
Views: 11711
Reputation: 61
Absolutely, @cronoik's answer is the correct one. But if you have already installed the sentencepiece package and still get the error, restart the runtime environment so that transformers is imported again with the package available, and it will work.
Upvotes: 0
Reputation: 19365
I assume that the fast tokenizer works:
from transformers import XLNetTokenizerFast
tokenizer = XLNetTokenizerFast.from_pretrained('xlnet-base-cased', do_lower_case=True)
If it does, you are just missing the sentencepiece package, which the slow XLNetTokenizer depends on:
pip install sentencepiece
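If you want to confirm that the package is actually visible to the interpreter you are running (for example, after installing it in a notebook), a quick check along these lines may help. This is only a sketch using the standard library; `missing_packages` is a helper name I made up for illustration, not part of transformers.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    # find_spec returns None when a top-level module cannot be located.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical helper, used here to check the two packages relevant to this question.
missing = missing_packages(["sentencepiece", "transformers"])
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("Both packages were found; if the error persists, restart the runtime.")
```

If sentencepiece shows up as missing here even though pip reported a successful install, pip probably installed it into a different Python environment than the one running your notebook.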
Upvotes: 1