Reputation: 901
I was playing around and trying to implement my own loss function in TensorFlow, but I always get None gradients. To reproduce the problem, I've reduced my program to a minimal example. I define a very simple model:
import tensorflow as tf

model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(3,), name="input"),
        tf.keras.layers.Dense(64, activation="relu", name="layer2"),
        tf.keras.layers.Dense(3, activation="softmax", name="output"),
    ]
)
and then define a very simple (but probably useless) loss function:
def dummy_loss(x):
    return tf.reduce_sum(x)

def train(model, inputs, learning_rate):
    outputs = model(inputs)
    with tf.GradientTape() as t:
        current_loss = dummy_loss(outputs)
    temp = t.gradient(current_loss, model.trainable_weights)

train(model, tf.random.normal((10, 3)), learning_rate=0.001)
but t.gradient(current_loss, model.trainable_weights) gives me only a list of None values, i.e. [None, None, None, None]. Why is this the case? What am I doing wrong? Might there be a misconception on my side about how TensorFlow works?
Upvotes: 0
Views: 1043
Reputation: 33410
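You need to run the forward pass of the model (i.e. the computation graph) inside the GradientTape context so that all of the model's operations are recorded. In your code, model(inputs) is called before the tape is opened, so the tape never sees how the loss depends on the weights and returns None for each of them: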
with tf.GradientTape() as t:
    outputs = model(inputs)  # This line should be within the context manager
    current_loss = dummy_loss(outputs)
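
For completeness, here is a minimal sketch of the whole example with that change applied. The model and dummy_loss are taken from the question; the train_step name, the choice of tf.keras.optimizers.SGD, and the apply_gradients call are my own assumptions about what you would want to do with the gradients, not something the question specifies.

```python
import tensorflow as tf

# Model and loss copied from the question.
model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(3,), name="input"),
        tf.keras.layers.Dense(64, activation="relu", name="layer2"),
        tf.keras.layers.Dense(3, activation="softmax", name="output"),
    ]
)

def dummy_loss(x):
    return tf.reduce_sum(x)

# Hypothetical helper: the name and the optimizer argument are illustrative.
def train_step(model, inputs, optimizer):
    with tf.GradientTape() as tape:
        # Forward pass happens inside the tape, so its ops are recorded.
        outputs = model(inputs)
        current_loss = dummy_loss(outputs)
    # The gradient call itself can sit outside the context manager.
    grads = tape.gradient(current_loss, model.trainable_weights)
    # Assumed update rule: plain SGD on the recorded gradients.
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return current_loss, grads

optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)
loss, grads = train_step(model, tf.random.normal((10, 3)), optimizer)
print([g is not None for g in grads])  # [True, True, True, True]
```

Note that with this particular loss the gradients come back as tensors but are (numerically close to) zero: each softmax row sums to 1, so the sum over all outputs is constant with respect to the weights. Any loss that actually depends on the weights will give non-trivial values.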
Upvotes: 1