Reputation: 41
Maybe someone here can help me; I am a bit stuck. I am trying to write my own neural network in C#. It works to some extent (it learns XOR). It is a simple neural network with an input, a hidden, and an output layer, and I am using ReLU as my activation function. My problem is that when I increase the number of hidden layers to more than ~16, I tend to get NaNs or Infinities, which messes everything up pretty quickly. I tried decreasing the learning rate, but that doesn't help. I think the problem is somewhere in my SGD function, but I can't find it, especially because it works with fewer layers.
This is the function:
private void SGD(double learningRate, double[] weightedSumHidden, double[] errors_output)
{
    /*---------------------------------------------------------------
     * -- Calculate Delta of the weight between hidden and output --
     ---------------------------------------------------------------*/
    var HiddenTransposed = Hidden.Transpose();
    var deltaWeightOutput = HiddenTransposed.Dot(errors_output);
    double[,] deltaWeightOutput2D = Matrix.Create(deltaWeightOutput); //Convert to Matrix
    WeightsHiddenOutput = WeightsHiddenOutput.Add(deltaWeightOutput2D.Multiply(learningRate));
    /*---------------------------------------------------------------
     * -- Calculate Delta of the weight between input and hidden --
     ---------------------------------------------------------------*/
    //First we have to calculate the Error in the hidden nodes ...
    //Transposed because we are going Backwards through the Network
    var WHOTransposed = WeightsHiddenOutput.Transpose();
    //Moves the Error back to the hidden layer
    var errors_hidden = WHOTransposed.Dot(errors_output);
    //Element Wise multiplication (Schur product)
    weightedSumHidden = ApplyDerivativeReLU(weightedSumHidden);
    //Moves the Error back through the Neuron
    errors_hidden = errors_hidden.Multiply(weightedSumHidden);
    //... then we can Calculate the Delta
    var InputTransposed = Inputs.Transpose();
    var deltaWeightHidden = InputTransposed.Dot(errors_hidden);
    double[,] deltaWeightHidden2D = Matrix.Create(deltaWeightHidden); //Convert to Matrix
    deltaWeightHidden2D = Inputs.Transpose().Dot(deltaWeightHidden2D);
    /*---------------------------------------------------------------
     * -- Adjust Weights and Biases using the delta --
     ---------------------------------------------------------------*/
    //The Biases just get adjusted by adding the Errors multiplied by the learning rate
    BiasOutput = BiasOutput.Add(errors_output.Multiply(learningRate)); //Output Bias
    BiasHidden = BiasHidden.Add(errors_hidden.Multiply(learningRate)); //Hidden Bias
    WeightsInputHidden = WeightsInputHidden.Add(deltaWeightHidden2D.Multiply(learningRate));
}
If anybody could help me with this, I would be really thankful; I have been stuck on it for days. I used this guide (http://neuralnetworksanddeeplearning.com/chap2.html) as the basis for my code. Also, I am using Accord.Math for the matrix math.
Thanks!
Upvotes: 0
Views: 255
Reputation: 21
You can use these checks together with a breakpoint to find where the error starts:
if (double.IsNaN(value))
if (double.IsInfinity(value))
if (float.IsNaN(value))
if (float.IsInfinity(value))
I had the same problem (with NaN), and throwing an exception helped me find the cause:
if (double.IsNaN(value) || double.IsInfinity(value)) throw new Exception();
Visual Studio's debugging tools are a great help: you can set breakpoints to inspect the values in your objects.
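Since the values in the question live in arrays and matrices rather than single variables, the same check can be wrapped in a small helper that scans a whole array. This is only a sketch: `NumericGuard` and `EnsureFinite` are hypothetical names invented for illustration (they are not part of Accord.Math), and it assumes the vectors are `double[]` and the weight matrices are `double[,]`, as the declarations in the question suggest.

```csharp
using System;

// Hypothetical helper, not part of Accord.Math or any other library.
static class NumericGuard
{
    // Throws at the first NaN or Infinity found in a vector,
    // reporting the name and index of the offending element.
    public static void EnsureFinite(double[] values, string name)
    {
        for (int i = 0; i < values.Length; i++)
        {
            if (double.IsNaN(values[i]) || double.IsInfinity(values[i]))
                throw new ArithmeticException($"{name}[{i}] is {values[i]}");
        }
    }

    // Same check for a two-dimensional matrix.
    public static void EnsureFinite(double[,] values, string name)
    {
        for (int r = 0; r < values.GetLength(0); r++)
        {
            for (int c = 0; c < values.GetLength(1); c++)
            {
                if (double.IsNaN(values[r, c]) || double.IsInfinity(values[r, c]))
                    throw new ArithmeticException($"{name}[{r},{c}] is {values[r, c]}");
            }
        }
    }
}
```

Calling something like NumericGuard.EnsureFinite(errors_hidden, nameof(errors_hidden)); after each step of the SGD function should make the debugger stop at the first step that produces a bad value, instead of several iterations later when everything is already NaN.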
Upvotes: 1