Rajarshee Mitra

Reputation: 1905

Loss on masked tensors

Suppose I have logits like

[[4.3, -0.5, -2.7, 0, 0],
[0.5, 2.3, 0, 0, 0]]

where the last two entries in the first example and the last three in the second are masked (that is, they are zero) and should not affect the loss and gradient computations.

How do I compute the cross-entropy loss between these logits and the corresponding labels? For concreteness, the labels for this example could be something like

[[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0]]

(One issue: softmax followed by log would be applied over the masked zeroes as well, so tf's cross-entropy method would include those elements in the loss.)

(You can also think about the problem like this: I have logits of varying lengths within a batch, e.g. lengths 3 and 2 for example 1 and example 2 respectively. The same holds for the labels.)
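
To see the issue concretely, here is a minimal sketch (my own illustration, TF 2.x eager style, not part of the question): a plain softmax over the padded rows gives the masked zero positions real probability mass, which is exactly what would leak into the loss.

import tensorflow as tf

logits = tf.constant([[4.3, -0.5, -2.7, 0.0, 0.0],
                      [0.5, 2.3, 0.0, 0.0, 0.0]])

# Softmax over the full padded rows: the zero (masked) positions
# still receive nonzero probability, so they would affect the loss.
print(tf.nn.softmax(logits, axis=-1).numpy())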

Upvotes: 11

Views: 5373

Answers (4)

Carlos S Traynor

Reputation: 141

With TF2 you may be able to use boolean indexing in this case. That is what I ended up doing.

import tensorflow as tf

mask = logits != 0                    # True where the entry is a real logit
weights = tf.cast(mask, tf.float32)   # optional (0, 1) weights built from the mask
labels = tf.cast(labels, tf.float32)
logits = tf.cast(logits, tf.float32)
# Boolean indexing keeps only the unmasked entries (flattened to 1-D)
tf.compat.v1.losses.softmax_cross_entropy(labels[mask], logits[mask])
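
If plain bracket indexing with a boolean tensor is not supported by your TF version, tf.boolean_mask performs the same selection. A sketch of my own (the names are illustrative); the flattened results can then be fed to the loss as above:

import tensorflow as tf

logits = tf.constant([[4.3, -0.5, -2.7, 0.0, 0.0],
                      [0.5, 2.3, 0.0, 0.0, 0.0]])
labels = tf.constant([[1.0, 0.0, 0.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0, 0.0, 0.0]])

mask = tf.not_equal(logits, 0.0)               # True where the entry is a real logit
masked_logits = tf.boolean_mask(logits, mask)  # 1-D tensor of the kept entries
masked_labels = tf.boolean_mask(labels, mask)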

Upvotes: 0

P-Gn

Reputation: 24591

Masking the cross-entropy loss is a common operation, covered by the library. It actually handles the more general concept of weights; provide binary weights to achieve masking.

mask = tf.not_equal(logits, 0)  # True for the real entries; masked entries are zero, as in the OP
weights = tf.to_float(mask)     # convert to (0, 1) weights
loss = tf.losses.softmax_cross_entropy(labels, logits, weights)

Don't compute the softmax cross-entropy by explicitly taking the softmax of the output and then the cross-entropy: you lose the numerical precision and stability of doing both in one fused operation.
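
To illustrate the point, a small sketch (my own addition, TF 2.x eager style): the fused op and the manual softmax-then-log route agree on small logits, but only the fused op remains stable when logits grow large.

import tensorflow as tf

logits = tf.constant([[4.3, -0.5, -2.7]])
labels = tf.constant([[1.0, 0.0, 0.0]])

# Fused, numerically stable cross-entropy (log-sum-exp handled internally)
stable = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

# Manual softmax followed by log: works here, but prone to under/overflow
manual = -tf.reduce_sum(labels * tf.math.log(tf.nn.softmax(logits)), axis=-1)

print(stable.numpy(), manual.numpy())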

Upvotes: 6

PlanetGazer8360

Reputation: 11

You can do:

import tensorflow as tf
logits = [[4.3, -0.5, -2.7, 0, 0], [0.5, 2.3, 0, 0, 0]]
labels = [[1.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0]]  # float labels to match the logits dtype
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=labels))

Upvotes: 0

Ohad Rubin

Reputation: 480

What I ended up doing was the following:

import tensorflow as tf
import numpy as np

prelim_scores = tf.constant([[4.3, -0.5, -2.7, 0, 0], [0.5, 2.3, 0, 0, 0]])
mask = tf.constant([[True, True, True, False, False], [True, True, False, False, False]])
dummy_scores = tf.ones_like(prelim_scores) * -99999.0  # the base matrix to choose from if dummy relation
scores = tf.where(mask, prelim_scores, dummy_scores)  # [B, MAX_NUM_ACTIONS]
a = tf.nn.softmax(scores)
with tf.Session() as sess:
    print(sess.run(a))

result is:

[[9.9094123e-01 8.1551941e-03 9.0362143e-04 0 0]
 [1.4185105e-01 8.5814887e-01 0 0 0]]

credit goes to: here
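
If you also want the loss itself, the same trick carries straight through: pushing the masked positions to a large negative value gives them essentially zero probability, so they contribute nothing. A sketch continuing the snippet above (my own addition, reusing scores from it; not part of the original answer):

labels = tf.constant([[1., 0., 0., 0., 0.], [0., 1., 0., 0., 0.]])
# The -99999 fill drives the masked positions to ~0 probability,
# so they drop out of the cross-entropy.
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=scores))
with tf.Session() as sess:
    print(sess.run(loss))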

Upvotes: 2
