Gabriel Feldman

Reputation: 43

TensorFlow cross-entropy in the tutorial

I was just going through the TensorFlow tutorial (https://www.tensorflow.org/versions/r0.8/tutorials/mnist/pros/index.html#deep-mnist-for-experts).

I have two questions about it:

  1. Why does it use the cost function with y_ * log(y)? Shouldn't it be y_ * log(y) + (1-y_) * log(1-y)?

  2. How does TensorFlow know how to calculate the gradient of the cost function I use? Shouldn't we have a place somewhere to tell TensorFlow how to calculate the gradient?

Thanks!

Upvotes: 4

Views: 1379

Answers (1)

Sung Kim

Reputation: 8536

  1. When the output y is a single probability and the label y_ is 0 or 1, you use the two-term form y_ * log(y) + (1-y_) * log(1-y). When the labels are one-hot encoded, e.g. y_ = [0 1] or [1 0], the wrong class already contributes a zero to the sum, so y_ * log(y) (summed over the components) is enough. For two classes the two forms give exactly the same value; a short numeric check follows this list.

  2. Everything is a graph in TensorFlow, including your cost function.
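
To see why the two forms in point 1 agree, here is a minimal NumPy sketch; the label t and probability p are made-up values for illustration.

```python
import numpy as np

# Hypothetical binary example: label t and predicted probability p.
t = 1.0
p = 0.8

# Two-term binary cross-entropy: a single scalar output must account
# for both classes explicitly.
bce = -(t * np.log(p) + (1 - t) * np.log(1 - p))

# One-hot form: the label vector already zeros out the wrong class,
# so the (1 - y_) term is carried by the other component of the sum.
y_ = np.array([1 - t, t])        # one-hot label [0, 1]
y  = np.array([1 - p, p])        # softmax-style output [0.2, 0.8]
ce = -np.sum(y_ * np.log(y))

print(bce, ce)                   # both print 0.2231..., the forms agree
```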


Because the cost function is built from graph operations, each node knows its own operation and its local gradient. TensorFlow uses backpropagation (the chain rule) over the graph to compute the gradient of the cost for you.
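
As a concrete sketch of point 2, the snippet below builds the tutorial's cost as a graph and asks TensorFlow for the gradient. It is written against the TF 1.x API of the r0.8 era (on TF 2.x you would need tf.compat.v1 with eager execution disabled); the shapes and feed values are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical 10-class setup, as in the MNIST tutorial.
logits = tf.placeholder(tf.float32, [None, 10])   # raw model outputs
y_     = tf.placeholder(tf.float32, [None, 10])   # one-hot labels
y      = tf.nn.softmax(logits)

# The cost is itself a subgraph of ops (mul, log, reduce_sum, neg).
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))

# No derivative code needed: tf.gradients walks the graph backwards,
# combining each op's registered local gradient via the chain rule.
grad_wrt_logits, = tf.gradients(cross_entropy, [logits])

with tf.Session() as sess:
    feed = {logits: np.random.randn(2, 10),       # a made-up batch of 2
            y_: np.eye(10)[[3, 7]]}               # true classes 3 and 7
    print(sess.run(grad_wrt_logits, feed_dict=feed))
```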

Upvotes: 5
