Reputation: 991
I'm trying to create a language model. I have logit and target tensors of size [32, 312, 512].
Where:
.shape[0] is batch_size
.shape[1] is sequence_max_len
.shape[2] is vocabulary_size
The question is: when I pass logit and target to the loss function as follows:
self.loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(
        logits=self.logit, labels=self.y))
does it compute the appropriate loss for the current batch? Or should I reshape logit and target to the following shape: [32, 312*512]?
Thanks in advance for your help!
Upvotes: 0
Views: 122
Reputation: 991
The answer is: it's irrelevant, since tf.nn.softmax_cross_entropy_with_logits() has a dim argument:
dim: The class dimension. Defaulted to -1 which is the last dimension.
name: A name for the operation (optional).
Also, inside tf.nn.softmax_cross_entropy_with_logits() you have this code:
# Make precise_logits and labels into matrices.
precise_logits = _flatten_outer_dims(precise_logits)
labels = _flatten_outer_dims(labels)
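To illustrate, here is a minimal sketch (assuming TF 2.x eager execution; the dummy tensors and names like target_ids are just placeholders, not part of the original model) showing that passing the rank-3 tensors directly or flattening them yourself gives the same mean loss, because the softmax is taken along the last axis either way:
import numpy as np
import tensorflow as tf

batch, seq_len, vocab = 32, 312, 512

# dummy stand-ins for the real model outputs and targets
logits = tf.random.normal([batch, seq_len, vocab])
target_ids = np.random.randint(0, vocab, size=(batch, seq_len))
labels = tf.one_hot(target_ids, depth=vocab)  # each one-hot row sums to 1

# 1) pass the rank-3 tensors directly; the class axis defaults to the last one
loss_3d = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))

# 2) flatten the outer (batch, time) dimensions yourself
loss_2d = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(
        labels=tf.reshape(labels, [-1, vocab]),
        logits=tf.reshape(logits, [-1, vocab])))

print(loss_3d.numpy(), loss_2d.numpy())  # identical up to float tolerance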
Upvotes: 0
Reputation: 1058
The API documentation says about labels:
labels: Each row labels[i] must be a valid probability distribution
If you are predicting one character at a time, you have a probability distribution over your vocabulary of size 512 (the probabilities of the individual characters sum to 1). Given that your labels and unscaled logits have shape [32, 312, 512], you should reshape them to [32*312, 512] before calling the function. That way each row of your labels is a valid probability distribution, the function itself converts your unscaled logits into a probability distribution, and the loss is then calculated.
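A minimal sketch of the reshape described above (assuming TF 2.x eager execution; logits, target_ids, etc. are hypothetical placeholders for your real tensors):
import tensorflow as tf

batch, seq_len, vocab = 32, 312, 512

# hypothetical stand-ins for the model output and the integer targets
logits = tf.random.normal([batch, seq_len, vocab])
target_ids = tf.random.uniform([batch, seq_len], maxval=vocab, dtype=tf.int32)

# one-hot targets so each labels row is a valid probability distribution
labels = tf.one_hot(target_ids, depth=vocab)               # [32, 312, 512]

# collapse batch and time into rows: one row per predicted character
logits_2d = tf.reshape(logits, [batch * seq_len, vocab])   # [32*312, 512]
labels_2d = tf.reshape(labels, [batch * seq_len, vocab])

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=labels_2d, logits=logits_2d))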
Upvotes: 1