Reputation: 73
I have implemented a custom binary cross-entropy loss function in TensorFlow. To test it, I compared it with the built-in binary cross-entropy loss in TensorFlow, but I got very different results in the two cases and I cannot explain this behaviour.
def custom_loss(eps, w1, w2):
    def loss(y_true, y_pred):
        ans = -1*(w1*y_true*tf.log(y_pred+eps) + w2*(1-y_true)*tf.log(y_pred+eps))
        return ans
    return loss
I set eps to 1e-6, w1=1 and w2=1. With my implementation the loss dropped to very small values almost immediately, whereas the inbuilt loss function in TensorFlow showed a steady, gradual drop.
Edit: Here are the outputs:
1: Using the custom implementation:
1/650 [..............................] - ETA: 46:37 - loss: 0.8810 - acc: 0.50
2/650 [..............................] - ETA: 41:27 - loss: 0.4405 - acc: 0.40
3/650 [..............................] - ETA: 39:38 - loss: 0.2937 - acc: 0.41
4/650 [..............................] - ETA: 38:44 - loss: 0.2203 - acc: 0.45
5/650 [..............................] - ETA: 38:13 - loss: 0.1762 - acc: 0.46
6/650 [..............................] - ETA: 37:47 - loss: 0.1468 - acc: 0.42
7/650 [..............................] - ETA: 37:29 - loss: 0.1259 - acc: 0
2: Using the inbuilt implementation:
1/650 [..............................] - ETA: 48:15 - loss: 2.4260 - acc: 0.31
2/650 [..............................] - ETA: 42:09 - loss: 3.1842 - acc: 0.46
3/650 [..............................] - ETA: 40:10 - loss: 3.4615 - acc: 0.47
4/650 [..............................] - ETA: 39:06 - loss: 3.9737 - acc: 0.45
5/650 [..............................] - ETA: 38:28 - loss: 4.5173 - acc: 0.47
6/650 [..............................] - ETA: 37:58 - loss: 5.1865 - acc: 0.45
7/650 [..............................] - ETA: 37:41 - loss: 5.8239 - acc: 0.43
8/650 [..............................] - ETA: 37:24 - loss: 5.6979 - acc: 0.46
9/650 [..............................] - ETA: 37:12 - loss: 5.5973 - acc: 0.47
The input is an image from the MURA dataset. To keep the test fair, the same images are passed in both runs.
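For reference, each run plugs its loss into the same Keras model like this; the model below is a simplified placeholder, not the actual network used:

import tensorflow as tf

# custom_loss as defined above

# placeholder model -- the real one is a CNN over MURA radiographs
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(224, 224, 3)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# run 1: the custom implementation
model.compile(optimizer='adam', loss=custom_loss(eps=1e-6, w1=1, w2=1), metrics=['acc'])

# run 2: the inbuilt implementation (same model, same images)
# model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])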
Upvotes: 3
Views: 184
Reputation: 14525
You have a slight error in your implementation.
You have:
ans = -1*(w1*y_true*tf.log(y_pred+eps) + w2*(1-y_true)*tf.log(y_pred + eps))
Whereas I think you were aiming for:
ans = -1*(w1*y_true*tf.log(y_pred+eps) + w2*(1-y_true)*tf.log(1 - y_pred + eps))
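This typo also explains why your loss collapsed: with w1 = w2 = 1, the incorrect expression simplifies to -log(y_pred + eps) regardless of y_true, so the network can push the loss towards zero simply by predicting 1 for every sample. A quick numeric sketch of that simplification (values chosen arbitrarily):

import numpy as np

eps = 1e-6
y_true = np.array([0.0, 1.0])    # labels don't matter to the buggy loss
y_pred = np.array([0.99, 0.99])  # network predicting ~1 everywhere

# both terms use log(y_pred + eps), so y_true cancels out of the sum
buggy = -1*(y_true*np.log(y_pred+eps) + (1-y_true)*np.log(y_pred+eps))
print(buggy)  # ~[0.01 0.01] -- tiny, and identical for both labels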
Generally we also take the mean of this loss, which makes our implementation:
def custom_loss(eps, w1, w2):
    def loss(y_true, y_pred):
        # weighted binary cross-entropy, averaged over the batch
        ans = -1*(w1*y_true*tf.log(y_pred+eps) + w2*(1-y_true)*tf.log(1-y_pred+eps))
        return tf.reduce_mean(ans)
    return loss
which we can now test against the out-of-the-box implementation (note that custom_loss returns a closure, so it must be called with its parameters first):

y_true = tf.constant([0.1, 0.2])
y_pred = tf.constant([0.11, 0.19])

custom_loss(eps=1e-6, w1=1, w2=1)(y_true, y_pred)    # == 0.41316
tf.keras.losses.binary_crossentropy(y_true, y_pred)  # == 0.41317
and find that the results match to many decimal places. (The tiny remaining difference is most likely the epsilon: Keras clips y_pred with its backend epsilon, which defaults to 1e-7 rather than the 1e-6 used here. Either way, a difference in the fifth decimal place is negligible.)
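If you want to reproduce the check end to end, here is a self-contained sketch (written against TF 2.x, where tf.log has become tf.math.log and eager execution lets you print the values directly):

import tensorflow as tf

def custom_loss(eps, w1, w2):
    def loss(y_true, y_pred):
        # weighted binary cross-entropy; eps guards against log(0)
        ans = -1*(w1*y_true*tf.math.log(y_pred+eps)
                  + w2*(1-y_true)*tf.math.log(1-y_pred+eps))
        return tf.reduce_mean(ans)
    return loss

y_true = tf.constant([0.1, 0.2])
y_pred = tf.constant([0.11, 0.19])

print(custom_loss(1e-6, 1, 1)(y_true, y_pred).numpy())              # ~0.41316
print(tf.keras.losses.binary_crossentropy(y_true, y_pred).numpy())  # ~0.41317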
Upvotes: 3