Reputation: 2160
So I define this custom loss function in Keras with a TensorFlow backend to train a background-extraction autoencoder. It's supposed to ensure that the prediction x_hat doesn't stray too far from B0, the median of the predictions taken over the batch.
def ben_loss(x, x_hat):
    B0 = tf_median(tf.transpose(x_hat))
    sigma = tf.reduce_mean(tf.sqrt(tf.abs(x_hat - B0) / 0.4), axis=0)
    # I divide by sigma in the next step. So I add a small float32 to sigma
    # so as to prevent background_term from becoming a nan.
    sigma += 1e-22
    background_term = tf.reduce_mean(tf.abs(x_hat - B0) / sigma, axis=-1)
    bce = binary_crossentropy(x, x_hat)
    loss = bce + background_term
    return loss
When I try to train the network with this loss function, the loss almost immediately becomes NaN. Does anyone know why this is happening? You can reproduce the error by cloning my repo and running this script.
Upvotes: 0
Views: 5008
Reputation: 2160
The NaN was coming from the fact that tf.abs(x_hat - B0) was approaching a tensor with all-zero entries. Since B0 is the median of x_hat, some entries of x_hat - B0 are exactly zero, and the gradient of sqrt blows up there, which made the derivative of sigma with respect to x_hat NaN. The solution was to add a small value to that quantity before taking the square root.
def ben_loss(x, x_hat):
    B0 = tf_median(tf.transpose(x_hat))
    F0 = tf.abs(x_hat - B0) + 1e-10
    sigma = tf.reduce_mean(tf.sqrt(F0 / 0.4), axis=0)
    background_term = tf.reduce_mean(F0 / sigma, axis=-1)
    bce = binary_crossentropy(x, x_hat)
    loss = bce + background_term
    return loss
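If you want to see the failure mode in isolation, here is a minimal sketch (not from the repo above; written for TensorFlow 2.x eager mode, but tf.gradients in a TF 1.x graph behaves the same way). Because B0 is the median of x_hat, at least one entry of x_hat - B0 is exactly zero, and the gradient of sqrt(|u|) at u = 0 evaluates to inf * 0 = NaN:

import tensorflow as tf

u = tf.Variable([0.0, 1.0, 4.0])  # one entry is exactly zero, like x_hat - B0 at the median
with tf.GradientTape() as tape:
    y = tf.reduce_sum(tf.sqrt(tf.abs(u) / 0.4))
print(tape.gradient(y, u))        # [nan, ~0.79, ~0.40] -- the zero entry produces nan

# Shifting the argument of sqrt away from zero, as in the fixed loss, keeps every gradient finite.
with tf.GradientTape() as tape:
    y_safe = tf.reduce_sum(tf.sqrt((tf.abs(u) + 1e-10) / 0.4))
print(tape.gradient(y_safe, u))   # [0., ~0.79, ~0.40] -- no nan

That is why adding 1e-22 to sigma alone didn't help: the NaN appears in the gradient of the sqrt itself, so the guard has to go inside the sqrt, not after it.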
Upvotes: 5