Reputation: 587
I'm trying to create and train a Sequential model like so:
def model(training: Dataset, validation: Dataset):
    model = Sequential(layers=[Embedding(input_dim=1001, output_dim=16), Dropout(0.2), GlobalAveragePooling1D(), Dropout(0.2), Dense(1)])
    model.compile(loss=BinaryCrossentropy(from_logits=True), optimizer='adam', metrics=BinaryAccuracy(threshold=0.0))
    model.fit(x=training, validation_data=validation, epochs=10)
When I run it, I get the following error on the model.fit line:
ValueError: No gradients provided for any variable: ['embedding/embeddings:0', 'dense/kernel:0', 'dense/bias:0'].
I've come across some answers talking about the use of optimizers, but how would that apply to Sequential rather than Model? Is there something else that I'm missing?
Edit: The result of print(training):
<MapDataset shapes: ((None, 250), (None,)), types: (tf.int64, tf.int32)>
Edit: A script that will reproduce the error using IMDB sample data
from tensorflow.keras import Sequential
from tensorflow import data
from keras.layers import TextVectorization
import tensorflow as tf
from tensorflow.keras.layers import Embedding, Dropout, GlobalAveragePooling1D, Dense
from tensorflow.keras.metrics import BinaryAccuracy, BinaryCrossentropy
import os
def split_dataset(dataset: data.Dataset):
    record_count = len(list(dataset))
    training_count = int((70 / 100) * record_count)
    validation_count = int((15 / 100) * record_count)
    raw_train_ds = dataset.take(training_count)
    raw_val_ds = dataset.skip(training_count).take(validation_count)
    raw_test_ds = dataset.skip(training_count + validation_count)
    return {"train": raw_train_ds, "test": raw_test_ds, "validate": raw_val_ds}

def clean(text, label):
    return tf.strings.unicode_transcode(text, "US ASCII", "UTF-8")

def vectorize_dataset(dataset: data.Dataset):
    return dataset.map(vectorize_text)

def vectorize_text(text, label):
    text = tf.expand_dims(text, -1)
    return vectorize_layer(text), label

url = "https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"
dataset_tar = tf.keras.utils.get_file("aclImdb_v1", url,
                                      untar=True, cache_dir='.',
                                      cache_subdir='')
dataset_dir = os.path.join(os.path.dirname(dataset_tar), 'aclImdb')
batch_size = 32
seed = 42
dataset = tf.keras.preprocessing.text_dataset_from_directory(
    'aclImdb/train',
    batch_size=batch_size,
    validation_split=0.2,
    subset='training',
    seed=seed)
split_data = split_dataset(dataset)
raw_train = split_data['train']
raw_val = split_data['validate']
raw_test = split_data['test']
vectorize_layer = TextVectorization(max_tokens=10000, output_mode="int", output_sequence_length=250, ngrams=1)
cleaned_text = raw_train.map(clean)
vectorize_layer.adapt(cleaned_text)
train = vectorize_dataset(raw_train)
test = vectorize_dataset(raw_test)
validate = vectorize_dataset(raw_val)

def model(training, validation):
    sequential_model = Sequential(
        layers=[Embedding(input_dim=1001, output_dim=16), Dropout(0.2), GlobalAveragePooling1D(), Dropout(0.2),
                Dense(1)])
    sequential_model.compile(loss=BinaryCrossentropy(from_logits=True), optimizer='adam', metrics=BinaryAccuracy(threshold=0.0))
    sequential_model.fit(x=training, validation_data=validation, epochs=10)

model(train, validate)
Upvotes: 0
Views: 869
Reputation: 743
The problem in your code occurs at the line below:
vectorize_layer = TextVectorization(max_tokens=10000, output_mode="int", output_sequence_length=250, ngrams=1)
The max_tokens argument of the TextVectorization layer sets the maximum number of unique tokens in the vocabulary.
Embedding layer: The Embedding layer can be understood as a lookup table that maps from integer indices (which stand for specific words) to dense vectors (their embeddings).
In your code, the Embedding dimensions are (1001, 16). That means the table only accommodates word indices in the range 0 to 1000; any index of 1001 or greater has no row in the table and is not handled. Hence the ValueError.
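To make the lookup-table idea concrete, here is a minimal plain-Python sketch. It is not how Keras implements Embedding; make_embedding_table and lookup are hypothetical helpers written only for illustration, and the strict range check is an assumption of this sketch rather than guaranteed Keras behaviour.

```python
import random


def make_embedding_table(input_dim, output_dim, seed=0):
    """Build a toy lookup table: one row of `output_dim` floats per index."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(output_dim)]
            for _ in range(input_dim)]


def lookup(table, token_ids):
    """Return the embedding row for each token id; ids outside the table raise."""
    for token_id in token_ids:
        if not 0 <= token_id < len(table):
            raise IndexError(
                f"token id {token_id} is outside the valid range 0..{len(table) - 1}")
    return [table[token_id] for token_id in token_ids]


# Same dimensions as the question: 1001 rows, 16 values per row.
table = make_embedding_table(input_dim=1001, output_dim=16)

# Ids 0..1000 each have a row, so this succeeds.
vectors = lookup(table, [0, 42, 1000])
```

A vectorizer built with max_tokens=10000 can emit ids up to 9999, so calling lookup(table, [9999]) on this table raises IndexError. Keeping input_dim at least as large as max_tokens avoids the mismatch.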
I changed the layers to TextVectorization(max_tokens=5000) and Embedding(5000, 16), and ran your code. What I got is shown below:
from tensorflow import keras
from tensorflow.keras import layers

def model(training, validation):
    # input_dim matches the max_tokens=5000 used for TextVectorization
    model = keras.Sequential(
        [
            layers.Embedding(input_dim=5000, output_dim=16),
            layers.Dropout(0.2),
            layers.GlobalAveragePooling1D(),
            layers.Dropout(0.2),
            layers.Dense(1),
        ]
    )
    # The loss comes from keras.losses, the metric from keras.metrics
    model.compile(
        optimizer=keras.optimizers.Adam(),
        loss=keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=keras.metrics.BinaryAccuracy(threshold=0.0)
    )
    model.fit(x=training, validation_data=validation, epochs=10)
    return model
Output:
Epoch 1/10 437/437 [==============================] - 10s 22ms/step - loss: 0.6797 - binary_accuracy: 0.6455 - val_loss: 0.6539 - val_binary_accuracy: 0.7554
Epoch 2/10 437/437 [==============================] - 10s 22ms/step - loss: 0.6109 - binary_accuracy: 0.7625 - val_loss: 0.5700 - val_binary_accuracy: 0.7880
Epoch 3/10 437/437 [==============================] - 9s 22ms/step - loss: 0.5263 - binary_accuracy: 0.8098 - val_loss: 0.4931 - val_binary_accuracy: 0.8233
Epoch 4/10 437/437 [==============================] - 10s 22ms/step - loss: 0.4580 - binary_accuracy: 0.8368 - val_loss: 0.4373 - val_binary_accuracy: 0.8448
Epoch 5/10 437/437 [==============================] - 10s 22ms/step - loss: 0.4072 - binary_accuracy: 0.8560 - val_loss: 0.4003 - val_binary_accuracy: 0.8522
Epoch 6/10 437/437 [==============================] - 10s 22ms/step - loss: 0.3717 - binary_accuracy: 0.8641 - val_loss: 0.3733 - val_binary_accuracy: 0.8589
Epoch 7/10 437/437 [==============================] - 10s 22ms/step - loss: 0.3451 - binary_accuracy: 0.8728 - val_loss: 0.3528 - val_binary_accuracy: 0.8582
Epoch 8/10 437/437 [==============================] - 9s 22ms/step - loss: 0.3220 - binary_accuracy: 0.8806 - val_loss: 0.3345 - val_binary_accuracy: 0.8673
Epoch 9/10 437/437 [==============================] - 9s 22ms/step - loss: 0.3048 - binary_accuracy: 0.8868 - val_loss: 0.3287 - val_binary_accuracy: 0.8673
Epoch 10/10 437/437 [==============================] - 10s 22ms/step - loss: 0.2891 - binary_accuracy: 0.8929 - val_loss: 0.3222 - val_binary_accuracy: 0.8679
Upvotes: 1
Reputation: 5079
BinaryCrossentropy is imported from tf.keras.metrics, which is why no gradients could be computed. A metric only tracks a running value; it is not a differentiable loss.
The correct import is from tensorflow.keras.losses import BinaryCrossentropy.
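The difference can be checked in isolation. The following is a small sketch, assuming TensorFlow 2.x is installed; the tensors x, y_true and the variable w are made-up toy values, not taken from the question's data.

```python
import tensorflow as tf

# Toy data: two samples, one feature, one trainable weight.
x = tf.constant([[1.0], [2.0]])
y_true = tf.constant([[1.0], [0.0]])
w = tf.Variable([[0.5]])

# Loss class from tf.keras.losses: differentiable with respect to w.
loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True)
with tf.GradientTape() as tape:
    loss = loss_fn(y_true, tf.matmul(x, w))
loss_grad = tape.gradient(loss, w)

# Metric class of the same name from tf.keras.metrics: it accumulates its
# value in internal state variables, so the result is not connected to w.
metric_fn = tf.keras.metrics.BinaryCrossentropy(from_logits=True)
with tf.GradientTape() as tape:
    metric_value = metric_fn(y_true, tf.matmul(x, w))
metric_grad = tape.gradient(metric_value, w)

print("gradient via losses.BinaryCrossentropy: ", loss_grad)
# Expected to be None, which is what leads to "No gradients provided".
print("gradient via metrics.BinaryCrossentropy:", metric_grad)
```

Because both classes share a name and accept from_logits, the wrong import raises no error at compile time; the problem only surfaces once fit tries to compute gradients.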
Upvotes: 1