Reputation: 1753
I am trying to train a CNN model in TensorFlow 2.0. It's a multiclass classification task. I am simplifying the code to make it more readable:
# Loss function
loss = tf.keras.metrics.CategoricalCrossentropy()

# Optimizer
optimizer = tf.optimizers.Adam(learning_rate=0.0005)

# Training:
for epoch in range(1000):
    # fetch mini-batch of data
    X_batch, y_batch = fetch_batch( [...] )
    with tf.GradientTape() as tape:
        current_loss = loss(y_batch, CNN(X_batch))  # take current loss
    # get the gradient of the loss function
    gradients = tape.gradient(current_loss, CNN.trainable_variables)
    # update weights
    optimizer.apply_gradients(zip(gradients, CNN.trainable_variables))

[ ... ]
At this point, I get an error:
ValueError: No gradients provided for any variable ...
I know where the problem is: something goes wrong when I call tape.gradient(). If I check the gradients object, this is what I get:

print(gradients)
[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]

I don't understand why gradients is returned like this. I have literally copy-pasted the code I used for training other (non-CNN) models in TF 2.0, and it always worked well. All the other elements of my model seem to behave as they should.
--
PS: this question is different from this one, which is based on TF 1.x.
Upvotes: 4
Views: 501
Reputation: 10474
I think you want tf.keras.losses.CategoricalCrossentropy as your loss, not the metrics version. These are actually different functions, not aliases: the metrics version accumulates state and does not produce a tensor connected to the tape, so no gradients can flow through it.
Upvotes: 2