Gokulakannan

Reputation: 128

Neural network cost function implementation

I am implementing a neural network in Python to train on handwritten digits. This is the cost function (binary cross-entropy):

J = -(1/m) * Σ [ y * log(h(x)) + (1 - y) * log(1 - h(x)) ]

In the log(1 - h(x)) term, if h(x) is 1, it evaluates to log(1 - 1) = log(0), so I'm getting a math error.

I'm initializing the weights randomly between 10 and 60. I'm not sure what I have to change, or where!

Upvotes: 0

Views: 351

Answers (1)

Maxim

Reputation: 53758

In this formula, h(x) is usually a sigmoid: h(x)=sigmoid(x), so it's never exactly 1.0, unless the activations in the network are too large (which is bad and will cause problems anyway). The same problem is possible with log(h(x)) when h(x)=0, i.e., when x is a large negative number.
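To see why "too large" activations break the log, here is a minimal sketch (not the OP's code): in float64 arithmetic the sigmoid rounds to exactly 1.0 once its input is large enough, because e^(-x) drops below machine epsilon.

```python
import math

def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

# For moderate inputs the output stays strictly between 0 and 1:
print(sigmoid(5.0))           # ~0.9933

# But for large activations, e^(-x) is smaller than machine epsilon,
# so 1 + e^(-x) rounds to 1.0 and the sigmoid is *exactly* 1.0.
# log(1 - sigmoid(40.0)) is then log(0) -> "math domain error".
print(sigmoid(40.0) == 1.0)   # True
```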

If you don't want to worry about numerical issues, simply add a small number before computing the log: log(h(x) + 1e-10).
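A sketch of that fix applied to the per-example binary cross-entropy (function name and signature are illustrative, not from the OP's code):

```python
import math

EPS = 1e-10  # small constant to keep the log argument away from zero

def binary_cross_entropy(y, h):
    """Per-example binary cross-entropy with clamped log arguments.

    y: true label (0 or 1), h: predicted probability h(x) in [0, 1].
    """
    return -(y * math.log(h + EPS) + (1 - y) * math.log(1 - h + EPS))

# Without EPS this call would raise "ValueError: math domain error",
# since it computes log(1 - 1). With EPS it is large but finite (~23.0):
print(binary_cross_entropy(0, 1.0))
```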

Other issues:

  • Weight initialization in the range [10, 60] doesn't look right; the weights should be small random numbers, e.g., drawn from [-0.01, 0.01].
  • The formula above is computing binary cross-entropy loss. If you're working with MNIST, it has 10 classes, so the loss must be multi-class cross-entropy. See this question for details.
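Both points above can be sketched together; this is an illustrative NumPy snippet (layer sizes and names are assumptions, e.g. 784 inputs for flattened 28x28 MNIST images), not the OP's network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Small random initial weights in [-0.01, 0.01],
# instead of values in [10, 60].
W = rng.uniform(-0.01, 0.01, size=(784, 10))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def multiclass_cross_entropy(Y, P, eps=1e-10):
    """Mean multi-class cross-entropy.

    Y: one-hot labels of shape (m, 10), P: predicted probabilities (m, 10).
    """
    return -np.mean(np.sum(Y * np.log(P + eps), axis=1))

# Toy batch: 5 flattened 28x28 "images", one-hot labels over 10 classes.
X = rng.random((5, 784))
Y = np.eye(10)[rng.integers(0, 10, size=5)]
loss = multiclass_cross_entropy(Y, softmax(X @ W))
print(loss)  # roughly log(10) ~ 2.30, since near-zero weights give near-uniform predictions
```

With small initial weights, the logits are close to zero, so the starting loss is near -log(1/10), which is a quick sanity check that the loss is wired up correctly.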

Upvotes: 1
