Reputation: 1810
I'm trying to use the ModelCheckpoint callback in keras. However, it keeps telling me that val_loss is not available. I added a print statement in the code of ModelCheckpoint to check the contents of the logs input, and you can indeed see that val_loss is not present in the dictionary.
The weird thing is that val_loss is correctly reported at the end of each epoch, and it is present in the history object returned by model.fit. I clearly provide validation data (otherwise val_loss could not be evaluated at the end of each epoch).
...
3/3 - 65s - loss: 0.2053 - **val_loss: 0.1153**
Epoch 2/45
logs={'batch': 0, 'size': 30000, 'loss': 0.20355584}
WARNING:tensorflow:Can save best model only with val_loss available, skipping.
...
Is this a bug or am I missing something?
I'm using Keras version '2.2.4-tf' (called from tf.keras).
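For context, a minimal sketch that reproduces the setup (the model, data, and filename are made up for illustration; the point is the integer save_freq, which makes the callback fire mid-epoch):

```python
import numpy as np
import tensorflow as tf

# Toy model and data, purely for illustration.
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")
x, y = np.random.rand(64, 4), np.random.rand(64, 1)

# save_freq given as an integer counts *batches*, so the callback can run
# in the middle of an epoch, before val_loss has been computed.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_model.keras", monitor="val_loss",
    save_best_only=True, save_freq=10,
)
model.fit(x, y, validation_split=0.25, epochs=2, batch_size=8,
          callbacks=[checkpoint], verbose=0)
```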
Upvotes: 3
Views: 809
Adding the solution here, even though it is present on GitHub, for the benefit of the StackOverflow community.
The issue was caused by some confusion between keras.callbacks.ModelCheckpoint and tensorflow.keras.callbacks.ModelCheckpoint.
In the first (pure keras), the argument period controls how many epochs pass between saves. Saving therefore always happens at epoch end, when val_loss has already been computed and included in logs.
In tensorflow.keras.callbacks.ModelCheckpoint, by contrast, save_freq controls how many batches pass between saves. This causes the callback to be evaluated in the middle of an epoch, where val_loss is not available.
Changing save_freq to 'epoch' (the default) resolved the issue.
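A minimal sketch of the fix (model, data, and filename are illustrative): with save_freq="epoch" the callback runs at epoch end, after validation, so val_loss is present in logs and the best model gets saved.

```python
import numpy as np
import tensorflow as tf

# Toy model and data, purely for illustration.
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")
x, y = np.random.rand(64, 4), np.random.rand(64, 1)

# save_freq="epoch" (the default): the callback is evaluated at epoch end,
# after the validation pass, so logs contains val_loss.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_model.keras", monitor="val_loss",
    save_best_only=True, save_freq="epoch",
)
history = model.fit(x, y, validation_split=0.25, epochs=2, batch_size=8,
                    callbacks=[checkpoint], verbose=0)
```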
Upvotes: 1