Reputation: 43
I was just going through the TensorFlow tutorial (https://www.tensorflow.org/versions/r0.8/tutorials/mnist/pros/index.html#deep-mnist-for-experts), and I have two questions about it:

Why does it use the cost function y_ * log(y)? Shouldn't it be y_ * log(y) + (1 - y_) * log(1 - y)?

How does TensorFlow know how to calculate the gradient of the cost function I use? Shouldn't we have to tell TensorFlow somewhere how to calculate the gradient?
Thanks!
Upvotes: 4
Views: 1379
Reputation: 8536
When the label y_ is a single scalar (1 or 0), you use y_ * log(y) + (1 - y_) * log(1 - y); but when y_ is one-hot encoded, i.e. [0 1] or [1 0], y_ * log(y) summed over the classes is enough. In fact, they are the same: for two classes, the (1 - y_) * log(1 - y) term of the binary form is exactly the other class's contribution to the one-hot sum.
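As a quick sketch with made-up numbers (not the tutorial's values), you can check in numpy that the two forms give the same cross-entropy for a two-class one-hot label:

    import numpy as np

    # Made-up values for illustration: true label is class 1, predicted
    # probability of class 1 is 0.8.
    y_true = 1.0
    p = 0.8

    # Binary form: -(y_ * log(y) + (1 - y_) * log(1 - y))
    binary_ce = -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))

    # One-hot form: -sum(y_ * log(y)) over the two classes
    y_onehot = np.array([0.0, 1.0])   # one-hot encoding of label 1
    p_vec = np.array([1.0 - p, p])    # predicted distribution over both classes
    onehot_ce = -np.sum(y_onehot * np.log(p_vec))

    print(binary_ce, onehot_ce)       # both print 0.2231... -- the forms agree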
Everything in TensorFlow is a graph, including your cost function. Each node knows its own operation and its local gradient, so TensorFlow applies backpropagation (the chain rule) through the graph to compute the gradient automatically.
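As a minimal sketch (using the graph-style API of the r0.8 / TF 1.x era, with a hypothetical cost rather than the tutorial's), tf.gradients returns the symbolic gradient of any cost built from graph ops, with no gradient code written by you:

    import tensorflow as tf

    # Hypothetical cost (not the tutorial's): cost = x^2 + 3x, built from graph ops.
    x = tf.placeholder(tf.float32)
    cost = tf.square(x) + 3.0 * x

    # Each op (square, mul, add) has a registered gradient, so TensorFlow can
    # chain them together through the graph.
    grad = tf.gradients(cost, x)[0]   # symbolic gradient: 2x + 3

    with tf.Session() as sess:
        print(sess.run(grad, feed_dict={x: 2.0}))   # prints 7.0 = 2*2 + 3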
Upvotes: 5