Raphael

Reputation: 25

Machine learning using ReLU returns NaN

I am trying to figure out this whole machine learning thing, so I ran some tests. I wanted to make a network learn the sine function (with the angle in radians). The neural network is:

1 input (radian angle) / 2 hidden layers / 1 output (prediction of the sine)

For the squashing activation I am using ReLU, and it's important to note that when I was using the logistic function instead of ReLU, the script worked.

To build the training data, I made a loop that starts at 0 and finishes at 180; it converts the loop index to radians (radian = loop_index * Math.PI / 180), computes the sine of that angle, and stores both the radian value and the sine result.

So my table looks like this for an entry: {input: [RADIAN ANGLE], output: [sin(radian)]}

var train_table = [];
for (var i = 0; i <= 180; i++) {
    var radian = i * (Math.PI / 180);
    train_table.push({input: [radian], output: [Math.sin(radian)]});
}

I use this table to train my neural network with cross-entropy loss, a learning rate of 0.3, and 20,000 iterations.

The problem is that it fails: when I try to predict anything, it returns NaN.

I am using the framework Synaptic (https://github.com/cazala/synaptic) and here is a JSfiddle of my code: https://jsfiddle.net/my7xe9ks/2/

Upvotes: 0

Views: 5165

Answers (1)

Dr. Snoopy

Reputation: 56357

A learning rate must be carefully tuned; this parameter matters a lot, especially when the gradients explode and you get NaN. When that happens, you have to reduce the learning rate, usually by a factor of 10.

In your specific case, the learning rate is too high; if you use 0.05 or 0.01, the network trains and works properly.
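You can see the effect of the learning rate alone with a toy example outside of any framework. This is a minimal sketch, not Synaptic code: plain gradient descent on the quadratic loss L(w) = 50 * w², whose gradient is 100 * w. The `descend` helper is hypothetical, chosen only to illustrate how too large a rate makes the weight oscillate, overflow, and finally turn into NaN:

```javascript
// Gradient descent on L(w) = 50 * w^2 (gradient: 100 * w).
// With a rate that is too large, each step multiplies w by (1 - 100 * rate),
// so |w| grows without bound, overflows to Infinity, and the next
// update computes Infinity - Infinity, which is NaN.
function descend(rate, steps) {
  var w = 1;
  for (var i = 0; i < steps; i++) {
    var grad = 100 * w;   // dL/dw
    w = w - rate * grad;  // gradient step
  }
  return w;
}

console.log(descend(0.3, 300));   // NaN: the iterates blow up
console.log(descend(0.001, 300)); // close to 0: the iterates converge
```

The same mechanism operates inside a neural network: once one weight overflows, NaN propagates through every subsequent forward and backward pass, which is why every prediction comes back as NaN.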

Another important detail is that you are using cross-entropy as your loss. That loss is meant for classification, but you have a regression problem. You should use a mean squared error loss instead.
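Cross-entropy also interacts badly with ReLU outputs, because it takes logarithms of the prediction and assumes it lies strictly between 0 and 1, which a ReLU output does not guarantee. A small sketch with hypothetical loss helpers (binary cross-entropy and squared error, written out by hand rather than taken from Synaptic) shows the failure mode:

```javascript
// Binary cross-entropy: only defined for outputs strictly in (0, 1).
function crossEntropy(target, output) {
  return -(target * Math.log(output) + (1 - target) * Math.log(1 - output));
}

// Mean squared error (for one sample): defined for any real output.
function mse(target, output) {
  var d = target - output;
  return d * d;
}

// A ReLU unit can output values outside (0, 1), e.g. 1.2:
console.log(crossEntropy(0.5, 1.2)); // NaN (log of a negative number)
console.log(mse(0.5, 1.2));          // ~0.49, well defined everywhere
```

With MSE the loss and its gradient stay finite for any output value, which is the behavior you want for regressing sin(x).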

Upvotes: 5
