AttributeError: 'NoneType' object has no attribute 'tokenize'

Question

I am trying to use XLNet through the transformers library, but I keep getting the error "AttributeError: 'NoneType' object has no attribute 'tokenize'". I am unsure how to proceed. If anyone could point me in the right direction, it would be appreciated.

from transformers import XLNetTokenizer

tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased', do_lower_case=True)

print(' Original: ', X_train[1])

# Print the tweet split into tokens.
print('Tokenized: ', tokenizer.tokenize(X_train[1]))

# Print the tweet mapped to token ids.
print('Token IDs: ', tokenizer.convert_tokens_to_ids(tokenizer.tokenize(X_train[1])))




Original:  hey angel duh sexy really thanks haha
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
 in ()
      2 
      3 # Print the tweet split into tokens.
----> 4 print('Tokenized: ', tokenizer.tokenize(X_train[2]))
      5 
      6 # Print the tweet mapped to token ids.

AttributeError: 'NoneType' object has no attribute 'tokenize'

cronoik · Accepted Answer

I assume that:

from transformers import XLNetTokenizerFast
tokenizer = XLNetTokenizerFast.from_pretrained('xlnet-base-cased', do_lower_case=True)

works? In that case, you are just missing the sentencepiece package, which the slow XLNetTokenizer depends on:

pip install sentencepiece
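
If you want the problem to surface earlier than the call to tokenize, a small guard along these lines may help. This is only a sketch: is_installed and load_xlnet_tokenizer are hypothetical helper names, not part of transformers, and the advice to restart the session after installing is an assumption about notebook environments where transformers was already imported.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the import system can locate a top-level package."""
    return importlib.util.find_spec(package_name) is not None


def load_xlnet_tokenizer(model_name: str = "xlnet-base-cased"):
    """Hypothetical helper: load the slow XLNet tokenizer or fail loudly."""
    if not is_installed("sentencepiece"):
        raise ImportError(
            "sentencepiece is not installed; run `pip install sentencepiece` "
            "(assumption: a session restart may be needed afterwards)."
        )

    # Imported lazily so the dependency check above runs first.
    from transformers import XLNetTokenizer

    tokenizer = XLNetTokenizer.from_pretrained(model_name)
    if tokenizer is None:
        # Guard against the situation from the question, where the
        # tokenizer silently ends up as None.
        raise RuntimeError("from_pretrained returned None; check the installation.")
    return tokenizer


# Example usage (downloads the vocabulary on first run):
# tokenizer = load_xlnet_tokenizer()
# print(tokenizer.tokenize("hey angel duh sexy really thanks haha"))
```

The check runs before anything is downloaded, so a missing dependency is reported with a clear message instead of an AttributeError further down.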

Answers (2)

In case SentencePiece is installed and you still have the error
