Reputation: 29
I'm new to TensorFlow. Can someone explain how we get the answer 1.16012561?
unscaled_logits = tf.constant([[1., -3., 10.]])
target_dist = tf.constant([[0.1, 0.02, 0.88]])
softmax_xentropy = tf.nn.softmax_cross_entropy_with_logits(
    logits=unscaled_logits, labels=target_dist)
with tf.Session() as sess:
    print(sess.run(softmax_xentropy))
Output: [ 1.16012561]
Upvotes: 2
Views: 2609
Reputation: 59681
It works like this. First, the logits are passed through the softmax function, which turns them into a probability distribution:
import numpy as np
logits = np.array([1., -3., 10.])
# Softmax function
softmax = np.exp(logits) / np.sum(np.exp(logits))
softmax
>>> array([ 1.23394297e-04, 2.26004539e-06, 9.99874346e-01])
# It is a probability distribution because the values are in [0, 1]
# and add up to 1
np.sum(softmax)
>>> 0.99999999999999989 # Almost, that is
Then you compute the cross-entropy between the computed softmax probabilities and the target distribution:
target = np.array([0.1, 0.02, 0.88])
# Cross-entropy function
crossentropy = -np.sum(target * np.log(softmax))
print(crossentropy)
>>> 1.1601256622376641
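To see where that number actually comes from, it can help to look at the three terms of the sum individually (same logits and target as above):
import numpy as np

logits = np.array([1., -3., 10.])
target = np.array([0.1, 0.02, 0.88])
softmax = np.exp(logits) / np.sum(np.exp(logits))

# Per-class contributions to the cross-entropy: -target_i * log(softmax_i)
print(-target * np.log(softmax))
# roughly [0.90001, 0.26000, 0.00011]; adding them up gives the 1.1601256... above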
tf.nn.softmax_cross_entropy_with_logits returns one of these values "per vector" (by default, the "vectors" live in the last dimension), so, for example, if your input logits and targets have shape 10x3 you will end up with 10 cross-entropy values. Usually these are summed or averaged and the result is used as the loss value to minimize (which is what tf.losses.softmax_cross_entropy offers). The logic behind the cross-entropy expression is that target * np.log(softmax) takes negative values close to zero where target is similar to softmax, and diverges from zero (towards minus infinity) where they differ.
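To illustrate the "per vector" behaviour, here is a small NumPy sketch (the 2x3 shapes are just an example, not taken from the question) that computes one cross-entropy per row and then reduces them to a single loss value:
import numpy as np

logits = np.array([[1., -3., 10.],
                   [2., 0., -1.]])
targets = np.array([[0.1, 0.02, 0.88],
                    [0.7, 0.2, 0.1]])

# Row-wise softmax: the last dimension is the "vector" dimension
exp = np.exp(logits)
softmax = exp / np.sum(exp, axis=-1, keepdims=True)

# One cross-entropy value per row, shape (2,)
xent = -np.sum(targets * np.log(softmax), axis=-1)
print(xent)

# Reduce to a single scalar loss (averaging here; tf.losses.softmax_cross_entropy
# applies a similar reduction for you)
print(np.mean(xent))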
Note: This is a logical explanation of the function. Internally, TensorFlow most likely performs different but mathematically equivalent operations for better performance and numerical stability.
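As an illustration of the kind of equivalent-but-more-stable computation that note refers to (this is the common log-sum-exp / max-subtraction trick, not necessarily what TensorFlow actually does internally), the log-softmax can be computed without ever exponentiating large logits:
import numpy as np

logits = np.array([1., -3., 10.])
target = np.array([0.1, 0.02, 0.88])

# log(softmax) via the log-sum-exp trick: subtracting the max keeps the
# exponentials small, and no explicit division is needed
shifted = logits - np.max(logits)
log_softmax = shifted - np.log(np.sum(np.exp(shifted)))

crossentropy = -np.sum(target * log_softmax)
print(crossentropy)
# Same value as before: 1.1601256622376641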
Upvotes: 3