Leevo

Reputation: 1753

TensorFlow 2.0: tf.GradientTape() returns None results

I am trying to train a CNN model in TensorFlow 2.0. It's a multiclass classification task. I am simplifying the code to make it more readable:

# Loss function
loss = tf.keras.metrics.CategoricalCrossentropy()

# Optimizer
optimizer = tf.optimizers.Adam(learning_rate = 0.0005)


# Training:

for epoch in range(1000):

    # fetch mini batch of data
    X_batch, y_batch = fetch_batch( [...] )

    with tf.GradientTape() as tape:
        current_loss = loss(y_batch, CNN(X_batch))  # take current loss

    # get the gradient of the loss function
    gradients = tape.gradient(current_loss, CNN.trainable_variables)

    # update weights
    optimizer.apply_gradients(zip(gradients, CNN.trainable_variables))

    [ ... ]

At this point, I get an error:

ValueError: No gradients provided for any variable ...

I know where the problem is: something goes wrong when I call tape.gradient(). If I inspect the gradients object, this is what I get:

print(gradients)

[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]

I don't understand why gradients is returned like this. I have literally copy-pasted the training code from other (non-CNN) models in TF 2.0, and it always worked fine. All the other elements of my model seem to behave as they should.

--

PS: this question is different from this one, which is based on TF 1.x.

Upvotes: 4

Views: 501

Answers (1)

xdurch0

Reputation: 10474

I think you want tf.keras.losses.CategoricalCrossentropy as your loss, not the metrics version. These are actually different classes, not aliases: a metric accumulates state across calls and its result is not meant to be differentiated, which is why the tape returns None for every variable.
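Here is a minimal sketch of the fix applied to the asker's training loop. The tiny Sequential model, the random dummy batch, and the shapes are all placeholders I made up to stand in for CNN and fetch_batch(...); only the swap from tf.keras.metrics to tf.keras.losses is the actual point:

```python
import tensorflow as tf

# Hypothetical tiny model standing in for the asker's CNN.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8, 8, 1)),
    tf.keras.layers.Conv2D(4, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# The fix: use the losses version, which is differentiable end to end.
loss_fn = tf.keras.losses.CategoricalCrossentropy()
optimizer = tf.optimizers.Adam(learning_rate=0.0005)

# Dummy batch in place of fetch_batch( [...] ).
X_batch = tf.random.normal((2, 8, 8, 1))
y_batch = tf.one_hot([0, 2], depth=3)

with tf.GradientTape() as tape:
    current_loss = loss_fn(y_batch, model(X_batch))

# With the loss function, every variable now gets a gradient.
gradients = tape.gradient(current_loss, model.trainable_variables)
optimizer.apply_gradients(zip(gradients, model.trainable_variables))

print(all(g is not None for g in gradients))
```

With the metrics class in place of loss_fn, the same tape.gradient call reproduces the list of None values from the question.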

Upvotes: 2
