SelketDaly

Reputation: 587

"No gradients provided for any variable" when trying to fit Keras Sequential

I'm trying to create and train a Sequential model like so:

def model(training: Dataset, validation: Dataset):
    model = Sequential(layers=[
        Embedding(input_dim=1001, output_dim=16),
        Dropout(0.2),
        GlobalAveragePooling1D(),
        Dropout(0.2),
        Dense(1),
    ])
    model.compile(loss=BinaryCrossentropy(from_logits=True), optimizer='adam',
                  metrics=BinaryAccuracy(threshold=0.0))
    model.fit(x=training, validation_data=validation, epochs=10)

When I run it, I get the following error on the model.fit line:

ValueError: No gradients provided for any variable: ['embedding/embeddings:0', 'dense/kernel:0', 'dense/bias:0'].

I've come across some answers talking about the use of optimizers, but how would that apply to Sequential rather than Model? Is there something else that I'm missing?

Edit: The result of print(training):

<MapDataset shapes: ((None, 250), (None,)), types: (tf.int64, tf.int32)>
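
For reference, the same information can be read off the dataset's element_spec (the variable name here follows the script below):

print(training.element_spec)
# (TensorSpec(shape=(None, 250), dtype=tf.int64, name=None),
#  TensorSpec(shape=(None,), dtype=tf.int32, name=None))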

Edit: A script that will reproduce the error using IMDB sample data

from tensorflow.keras import Sequential
from tensorflow import data
from keras.layers import TextVectorization
import tensorflow as tf
from tensorflow.keras.layers import Embedding, Dropout, GlobalAveragePooling1D, Dense
from tensorflow.keras.metrics import BinaryAccuracy, BinaryCrossentropy
import os


def split_dataset(dataset: data.Dataset):
    record_count = len(list(dataset))
    training_count = int((70 / 100) * record_count)
    validation_count = int((15 / 100) * record_count)

    raw_train_ds = dataset.take(training_count)
    raw_val_ds = dataset.skip(training_count).take(validation_count)
    raw_test_ds = dataset.skip(training_count + validation_count)

    return {"train": raw_train_ds, "test": raw_test_ds, "validate": raw_val_ds}


def clean(text, label):
    return tf.strings.unicode_transcode(text, "US ASCII", "UTF-8")


def vectorize_dataset(dataset: data.Dataset):
    return dataset.map(vectorize_text)


def vectorize_text(text, label):
    text = tf.expand_dims(text, -1)
    return vectorize_layer(text), label


url = "https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"
dataset_tar = tf.keras.utils.get_file("aclImdb_v1", url,
                                      untar=True, cache_dir='.',
                                      cache_subdir='')
dataset_dir = os.path.join(os.path.dirname(dataset_tar), 'aclImdb')

batch_size = 32
seed = 42
dataset = tf.keras.preprocessing.text_dataset_from_directory(
    'aclImdb/train',
    batch_size=batch_size,
    validation_split=0.2,
    subset='training',
    seed=seed)

split_data = split_dataset(dataset)
raw_train = split_data['train']
raw_val = split_data['validate']
raw_test = split_data['test']

vectorize_layer = TextVectorization(max_tokens=10000, output_mode="int", output_sequence_length=250, ngrams=1)
cleaned_text = raw_train.map(clean)
vectorize_layer.adapt(cleaned_text)

train = vectorize_dataset(raw_train)
test = vectorize_dataset(raw_test)
validate = vectorize_dataset(raw_val)


def model(training, validation):
    sequential_model = Sequential(
        layers=[Embedding(input_dim=1001, output_dim=16), Dropout(0.2), GlobalAveragePooling1D(), Dropout(0.2),
                Dense(1)])
    sequential_model.compile(loss=BinaryCrossentropy(from_logits=True), optimizer='adam', metrics=BinaryAccuracy(threshold=0.0))
    sequential_model.fit(x=training, validation_data=validation, epochs=10)


model(train, validate)

Upvotes: 0

Views: 869

Answers (2)

Priya

Reputation: 743

The problem in your code occurs at the line below:

vectorize_layer = TextVectorization(max_tokens=10000, output_mode="int", output_sequence_length=250, ngrams=1)

The max_tokens argument of the TextVectorization layer is the vocabulary size, i.e. the total number of unique tokens the layer will keep.

Embedding layer: the Embedding layer can be understood as a lookup table that maps integer indices (which stand for specific words) to dense vectors (their embeddings).

In your code, the Embedding dimensions are (1001, 16), which means the lookup table only accommodates word indices in the range 0-1000; any index the vectorizer produces above 1000 falls outside the table and is not handled. Hence the ValueError.
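
To make the lookup-table picture concrete, here is a small self-contained sketch (a toy example, not taken from the question's code): an Embedding layer with input_dim=4 can only look up the indices 0 through 3.

import tensorflow as tf

# Toy lookup table: vocabulary of 4 tokens, 2-dimensional embeddings.
emb = tf.keras.layers.Embedding(input_dim=4, output_dim=2)

vectors = emb(tf.constant([0, 1, 2, 3]))
print(vectors.shape)  # (4, 2) -- every index is inside the table

# An index >= input_dim has no row in the table. On CPU this raises an
# InvalidArgumentError (behaviour can differ on GPU), which is why
# TextVectorization's max_tokens must not exceed the Embedding's input_dim.
# emb(tf.constant([10]))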

I changed the TextVectorization layer to max_tokens=5000, changed the Embedding layer to Embedding(5000, 16), and ran your code.

What I got is shown below:

def model(training, validation):
    model = keras.Sequential(
        [
            layers.Embedding(input_dim=5000, output_dim=16),
            layers.Dropout(0.2),
            layers.GlobalAveragePooling1D(),
            layers.Dropout(0.2),
            layers.Dense(1),
        ]
    )
    model.compile(
        optimizer=keras.optimizers.Adam(),
        loss=keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=[keras.metrics.BinaryAccuracy(threshold=0.0)],
    )
    model.fit(x=training, validation_data=validation, epochs=10)
    return model

Output:
Epoch 1/10
437/437 [==============================] - 10s 22ms/step - loss: 0.6797 - binary_accuracy: 0.6455 - val_loss: 0.6539 - val_binary_accuracy: 0.7554
Epoch 2/10
437/437 [==============================] - 10s 22ms/step - loss: 0.6109 - binary_accuracy: 0.7625 - val_loss: 0.5700 - val_binary_accuracy: 0.7880
Epoch 3/10
437/437 [==============================] - 9s 22ms/step - loss: 0.5263 - binary_accuracy: 0.8098 - val_loss: 0.4931 - val_binary_accuracy: 0.8233
Epoch 4/10
437/437 [==============================] - 10s 22ms/step - loss: 0.4580 - binary_accuracy: 0.8368 - val_loss: 0.4373 - val_binary_accuracy: 0.8448
Epoch 5/10
437/437 [==============================] - 10s 22ms/step - loss: 0.4072 - binary_accuracy: 0.8560 - val_loss: 0.4003 - val_binary_accuracy: 0.8522
Epoch 6/10
437/437 [==============================] - 10s 22ms/step - loss: 0.3717 - binary_accuracy: 0.8641 - val_loss: 0.3733 - val_binary_accuracy: 0.8589
Epoch 7/10
437/437 [==============================] - 10s 22ms/step - loss: 0.3451 - binary_accuracy: 0.8728 - val_loss: 0.3528 - val_binary_accuracy: 0.8582
Epoch 8/10
437/437 [==============================] - 9s 22ms/step - loss: 0.3220 - binary_accuracy: 0.8806 - val_loss: 0.3345 - val_binary_accuracy: 0.8673
Epoch 9/10
437/437 [==============================] - 9s 22ms/step - loss: 0.3048 - binary_accuracy: 0.8868 - val_loss: 0.3287 - val_binary_accuracy: 0.8673
Epoch 10/10
437/437 [==============================] - 10s 22ms/step - loss: 0.2891 - binary_accuracy: 0.8929 - val_loss: 0.3222 - val_binary_accuracy: 0.8679

Upvotes: 1

Frightera

Reputation: 5079

In your script, BinaryCrossentropy is imported from tf.keras.metrics; the metric version is not a differentiable loss, hence no gradients could be computed.

The correct import should have been from tensorflow.keras.losses import BinaryCrossentropy.
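
A minimal sketch of the corrected imports and compile call, assuming the rest of the question's model function is unchanged:

from tensorflow.keras.losses import BinaryCrossentropy   # losses, not metrics
from tensorflow.keras.metrics import BinaryAccuracy

model.compile(loss=BinaryCrossentropy(from_logits=True),  # differentiable loss object
              optimizer='adam',
              metrics=[BinaryAccuracy(threshold=0.0)])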

Upvotes: 1
