Reputation: 2337
I am using Keras with a custom loss function like below:

from keras.losses import mean_absolute_percentage_error

def custom_fn(y_true, y_pred):
    # changing y_true, y_pred values systematically
    return mean_absolute_percentage_error(y_true, y_pred)

Then I am calling model.compile(loss=custom_fn) and model.fit(X, y, ..., validation_data=(X_val, y_val), ...).
Keras then saves loss and val_loss in the model history. As a sanity check, when the model finishes training I call model.predict(X_val) so that I can calculate the validation loss manually with my custom_fn using the trained model.
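For reference, the manual check looks roughly like this (a minimal sketch; it assumes y_val is the validation target from above and uses the Keras backend to evaluate the loss tensor):

import keras.backend as K

# Predict on the validation set with the trained model, then apply
# custom_fn by hand; K.mean reduces the per-sample losses to a scalar.
y_val_pred = model.predict(X_val)
manual_val_loss = K.eval(K.mean(custom_fn(K.constant(y_val), K.constant(y_val_pred))))
print(manual_val_loss)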
I am saving the model from the best epoch using this callback:

callbacks.append(ModelCheckpoint(path, save_best_only=True, monitor='val_loss', mode='min'))

so the validation loss calculated this way should match Keras' val_loss value for the best epoch. But this is not happening.
As another attempt to figure this issue out, I am also doing this:

model.compile(loss=custom_fn, metrics=[custom_fn])

And to my surprise, val_loss and val_custom_fn do not match (nor do loss and custom_fn, for that matter). This is really strange: my custom_fn is essentially Keras' built-in mape with y_true and y_pred slightly manipulated. What is going on here?
PS: the layers I am using are LSTM layers and a final Dense layer, but I think this information is not relevant to the problem. I am also using regularisation as a hyperparameter, but not dropout.
Even removing custom_fn and using Keras' built-in mape as both loss function and metric, like so:

model.compile(loss='mape', metrics=['mape'])

and, for simplicity, removing the ModelCheckpoint callback, has the same effect: val_loss and val_mape are not equivalent for each epoch. This is extremely strange to me. Either I am missing something or there is a bug in the Keras code; the former is probably more realistic.
Upvotes: 7
Views: 2069
Reputation: 2337
This blog post suggests that Keras adds any regularisation used during training when calculating the validation loss, whereas no regularisation is applied when calculating the metric of choice. This is why the discrepancy occurs with any choice of loss function, as stated in the question.
This is something I could not find any documentation on from Keras. However, it seems to hold up, since when I remove all regularisation hyperparameters, val_loss and val_custom_fn match exactly in each epoch.
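The effect is easy to reproduce. Here is a minimal sketch (the single Dense layer, the l2 penalty, and the random data are placeholders for illustration only): evaluate() returns a loss that includes the L2 penalty, while the mape metric on the same data does not.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l2

# Tiny regularised model: the L2 penalty on the kernel is added to the loss.
model = Sequential([Dense(1, input_shape=(4,), kernel_regularizer=l2(0.01))])
model.compile(optimizer='sgd', loss='mape', metrics=['mape'])

X = np.random.rand(32, 4)
y = np.random.rand(32, 1)

# evaluate() returns [loss, mape]; the loss includes the regularisation
# term, the metric does not, so the two values differ by that penalty.
loss_value, mape_value = model.evaluate(X, y, verbose=0)
print(loss_value - mape_value)  # > 0: the regularisation penalty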
An easy workaround is either to use custom_fn as a metric and save the best model based on that metric (val_custom_fn) rather than on val_loss, or to loop over the epochs manually and calculate the correct validation loss yourself after training each epoch. The latter seems to make more sense, since there is no reason to include custom_fn both as a metric and as a loss function.
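For the first workaround, the checkpoint would look roughly like this (a sketch, reusing custom_fn and path from the question; Keras derives the metric name val_custom_fn from the function's name):

from keras.callbacks import ModelCheckpoint

# Register custom_fn as a metric as well, then monitor the metric
# (which excludes the regularisation penalty) instead of val_loss.
model.compile(loss=custom_fn, metrics=[custom_fn])
checkpoint = ModelCheckpoint(path, save_best_only=True,
                             monitor='val_custom_fn', mode='min')
model.fit(X, y, validation_data=(X_val, y_val), callbacks=[checkpoint])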
If anyone can find any evidence of this in the Keras documentation, that would be helpful.
Upvotes: 5