Reputation: 11
I'm trying to implement a custom loss function in Keras for a "partial label learning" problem. In my training set, each training instance is assigned a set of two candidate labels, only one of which is correct. I want a loss function that, during training, calculates the loss for each candidate label and keeps the minimum of the two. A simplified version of this function would look something like this:
def custom_loss(y_true, y_pred):
    num_labels = tf.reduce_sum(y_true)  # e.g. y_true = [0,1,0,0,1]
    if num_labels > 1:  # create 2 separate one-hot vectors
        y_true_1 = ?  # [0,1,0,0,0]
        y_true_2 = ?  # [0,0,0,0,1]
        loss_1 = K.categorical_crossentropy(y_true_1, y_pred)
        loss_2 = K.categorical_crossentropy(y_true_2, y_pred)
        loss = tf.minimum(loss_1, loss_2)
    else:
        loss = K.categorical_crossentropy(y_true, y_pred)
    return loss
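As a sanity check, here is the computation I have in mind in plain NumPy (just a sketch of the math, not the actual Keras loss; min_candidate_loss is a name I made up for this example):

```python
import numpy as np

def min_candidate_loss(y_true, y_pred, eps=1e-7):
    """Cross-entropy against each candidate label; keep the minimum."""
    candidates = np.flatnonzero(y_true)            # indices of the candidate labels
    losses = [-np.log(y_pred[c] + eps) for c in candidates]  # -log p(class) per candidate
    return min(losses)

y_true = np.array([0., 1., 0., 0., 1.])   # two candidate labels
y_pred = np.array([.05, .7, .05, .1, .1])
print(min_candidate_loss(y_true, y_pred))  # ≈ -log(0.7): the better candidate wins
```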
I tried to do it like so:
y_true = tf.constant([1., 0., 0., 0., 1., 0., 0., 0., 0.])
y_pred = tf.constant([.9, .05, .05, .5, .89, .6, .05, .01, .94])

def custom_loss(y_true, y_pred):
    def train_loss():
        y_train_copy = tf.Variable(0, dtype=y_true.dtype)
        y_train_copy = tf.assign(y_train_copy, y_true, validate_shape=False)
        label_cls = tf.where(tf.equal(y_true, 1))
        raplace = tf.Variable([0.])  # Variable
        y_true_1 = tf.compat.v1.scatter_nd_update(y_train_copy, [label_cls[0]], raplace)  # [0,1,0,0,0]
        y_true_2 = tf.compat.v1.scatter_nd_update(y_train_copy, [label_cls[1]], raplace)  # [0,0,0,0,1]
        loss_1 = K.categorical_crossentropy(y_true_1, y_pred)
        loss_2 = K.categorical_crossentropy(y_true_2, y_pred)
        min_loss = tf.minimum(loss_1, loss_2)
        return min_loss

    num_labels = tf.reduce_sum(y_true)  # [0,1,0,0,1]
    loss = tf.cond(num_labels > 1,
                   lambda: train_loss(),
                   lambda: K.categorical_crossentropy(y_true, y_pred))
    return loss

loss = custom_loss(y_true, y_pred)

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    print(sess.run(loss))
The problem is that, no matter how I try to take the minimum of the two losses, I get 0.0, even though loss_1 and loss_2 are definitely not 0.
Any idea why? Or a better way to implement this function?
Upvotes: 1
Views: 864
Reputation: 4475
There is no need to create the y_train_copy variable. I simplified your code, and the output is min(loss_1, loss_2):
y_true = tf.constant([1., 0., 0., 0., 1., 0., 0., 0., 0.])
y_pred = tf.constant([.9, .05, .05, .5, .89, .6, .05, .01, .94])

def custom_loss(y_true, y_pred):
    def train_loss():
        label_cls = tf.where(tf.equal(y_true, 1.))
        y_true_1 = tf.squeeze(tf.one_hot(label_cls[0], tf.size(y_true)), axis=0)
        y_true_2 = tf.squeeze(tf.one_hot(label_cls[1], tf.size(y_true)), axis=0)
        loss_1 = K.categorical_crossentropy(y_true_1, y_pred)
        loss_2 = K.categorical_crossentropy(y_true_2, y_pred)
        min_loss = tf.minimum(loss_1, loss_2)
        return min_loss

    num_labels = tf.reduce_sum(y_true)
    loss = tf.cond(num_labels > 1,
                   lambda: train_loss(),
                   lambda: K.categorical_crossentropy(y_true, y_pred))
    return loss

loss = custom_loss(y_true, y_pred)

with tf.Session() as sess:
    print(sess.run(loss))
The bug in your code is the use of tf.scatter_nd_update(): it changes the value of y_train_copy in place. When you run min_loss, both y_true_1 and y_true_2 are executed, so both updates land in the same variable and y_true_2 is always all zeros. That makes min_loss always zero. If you run loss_2 alone, you can see that it is not zero, because y_true_1 was never executed.
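The same aliasing effect can be reproduced with a plain NumPy array (an analogy, not TF code): both "results" are just names for the same mutable buffer, so every in-place update is visible through both of them.

```python
import numpy as np

buf = np.array([1., 0., 0., 0., 1.])  # shared mutable buffer, like the tf.Variable
y_true_1 = buf    # not a copy: another name for the same buffer
buf[0] = 0.       # first in-place update (zero the first candidate)
y_true_2 = buf    # still the same buffer
buf[4] = 0.       # second in-place update (zero the second candidate)

# Both "snapshots" see every update, so both end up all zeros:
print(y_true_1)  # [0. 0. 0. 0. 0.]
print(y_true_2)  # [0. 0. 0. 0. 0.]
```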
A better choice would be tf.scatter_nd, which builds a fresh tensor instead of mutating a variable. You can do it like this:
y_true = tf.constant([1., 0., 0., 0., 1., 0., 0., 0., 0.])
y_pred = tf.constant([.9, .05, .05, .5, .89, .6, .05, .01, .94])

label_cls = tf.where(tf.equal(y_true, 1.))
idx1, idx2 = tf.split(label_cls, 2)
raplace = tf.constant([1.])
y_true_1 = tf.scatter_nd(tf.cast(idx1, dtype=tf.int32), raplace, [tf.size(y_true)])
y_true_2 = tf.scatter_nd(tf.cast(idx2, dtype=tf.int32), raplace, [tf.size(y_true)])
loss_1 = K.categorical_crossentropy(y_true_1, y_pred)
loss_2 = K.categorical_crossentropy(y_true_2, y_pred)
min_loss = tf.minimum(loss_1, loss_2)

with tf.Session() as sess:
    print(sess.run(min_loss))
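If you need this per batch during training, the same idea can be vectorized with no scatter op at all: compute the cross-entropy for every class, mask out the non-candidates, and take the per-row minimum. Here is a NumPy sketch (batched_min_candidate_loss is an illustrative name; it assumes each row of y_pred is already a probability distribution):

```python
import numpy as np

def batched_min_candidate_loss(y_true, y_pred, eps=1e-7):
    """y_true: (batch, classes) multi-hot candidate mask; y_pred: probabilities."""
    per_class = -np.log(y_pred + eps)                 # CE if that class were the true label
    masked = np.where(y_true > 0, per_class, np.inf)  # ignore non-candidate classes
    return masked.min(axis=1)                         # best candidate per row

y_true = np.array([[0., 1., 0., 0., 1.],   # two candidates
                   [1., 0., 0., 0., 0.]])  # single label
y_pred = np.array([[.05, .7, .05, .1, .1],
                   [.6, .1, .1, .1, .1]])
print(batched_min_candidate_loss(y_true, y_pred))  # ≈ [-log(.7), -log(.6)]
```

This also handles the single-label rows for free, so no tf.cond-style branching is needed.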
Upvotes: 1