man zet

Reputation: 901

TensorFlow gradient always gives None when using GradientTape

I was playing around and trying to implement my own loss function in TensorFlow but I always get None gradients. To reproduce the problem I've now reduced my program to a minimal example. I define a very simple model:

import tensorflow as tf

model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(3,), name="input"),
        tf.keras.layers.Dense(64, activation="relu", name="layer2"),
        tf.keras.layers.Dense(3, activation="softmax", name="output"),
    ]
)

and then define a very simple (but probably useless) loss function:

def dummy_loss(x):
  return tf.reduce_sum(x)

def train(model, inputs, learning_rate):
  outputs = model(inputs)
  with tf.GradientTape() as t:
    current_loss = dummy_loss(outputs)
  temp = t.gradient(current_loss, model.trainable_weights)
train(model, tf.random.normal((10, 3)), learning_rate=0.001)

but t.gradient(current_loss, model.trainable_weights) gives me only a list of None values, i.e. [None, None, None, None]. Why is this the case? What am I doing wrong? Might there be a misconception on my side about how TensorFlow works?

Upvotes: 0

Views: 1043

Answers (1)

today

Reputation: 33410

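You need to run the forward pass (i.e. call the model) inside the GradientTape context so that all of the model's operations are recorded. In your code, model(inputs) runs before the tape is opened, so the tape never sees the trainable weights being used and has nothing to differentiate through: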

  with tf.GradientTape() as t:
    outputs = model(inputs)  # This line should be within context manager
    current_loss = dummy_loss(outputs)
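
To make the difference concrete, here is a minimal sketch that reuses the model and dummy_loss from the question and computes the gradients both ways. The helper names gradients_outside_tape and gradients_inside_tape are hypothetical and exist only for this comparison; the sketch assumes TensorFlow 2.x with eager execution.

```python
import tensorflow as tf

# Same model and loss as in the question.
model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(3,), name="input"),
        tf.keras.layers.Dense(64, activation="relu", name="layer2"),
        tf.keras.layers.Dense(3, activation="softmax", name="output"),
    ]
)


def dummy_loss(x):
    return tf.reduce_sum(x)


def gradients_outside_tape(model, inputs):
    # Hypothetical helper: forward pass runs BEFORE the tape is opened,
    # so the tape records nothing that involves the weights.
    outputs = model(inputs)
    with tf.GradientTape() as t:
        current_loss = dummy_loss(outputs)
    return t.gradient(current_loss, model.trainable_weights)


def gradients_inside_tape(model, inputs):
    # Hypothetical helper: forward pass runs INSIDE the tape context,
    # so every operation that touches the weights is recorded.
    with tf.GradientTape() as t:
        outputs = model(inputs)
        current_loss = dummy_loss(outputs)
    return t.gradient(current_loss, model.trainable_weights)


inputs = tf.random.normal((10, 3))
bad = gradients_outside_tape(model, inputs)
good = gradients_inside_tape(model, inputs)

print([g is None for g in bad])   # every entry is None
print([g.shape for g in good])    # one gradient tensor per weight
```

Note that with this particular dummy loss the gradients inside the tape are tensors but numerically close to zero: the softmax outputs of each row sum to 1, so the loss is constant with respect to the weights. That is a property of the toy loss, not of GradientTape.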

Upvotes: 1
