Reputation: 749
I am currently working through a deep learning example that uses a Tokenizer package, and I am getting the following error:
AttributeError: 'Tokenizer' object has no attribute 'word_index'
Here is my code:
from keras.preprocessing.text import Tokenizer
samples = ['The cat sat on the mat.', 'The dog ate my homework.']
tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_sequences(samples)
sequences = tokenizer.texts_to_sequences(samples)
one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))
Could anyone help me catch my mistake?
Upvotes: 3
Views: 18256
Reputation: 466
It appears the import works correctly, but the Tokenizer object has no word_index attribute yet.
According to the documentation, that attribute is only set once you call the method fit_on_texts on the Tokenizer object; your code calls fit_on_sequences instead.
The following code runs successfully:
from keras.preprocessing.text import Tokenizer
samples = ['The cat sat on the mat.', 'The dog ate my homework.']
tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_texts(samples)  # builds the vocabulary and sets word_index
one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))
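For completeness, the texts_to_sequences call from the original question also works once the tokenizer has been fit; a minimal sketch continuing from the code above (the printed indices assume the sample sentences as written):
sequences = tokenizer.texts_to_sequences(samples)
print(sequences)  # e.g. [[1, 2, 3, 4, 1, 5], [1, 6, 7, 8, 9]]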
Upvotes: 5