Yorian

Reputation: 2062

Neural network: weights and biases convergence

I've been reading up on a few topics regarding machine learning, neural networks, and deep learning, partly through this (in my opinion) excellent online book: http://neuralnetworksanddeeplearning.com/chap1.html

For the most part I've come to understand the workings of a neural network, but there is one question that still bugs me (based on the example on that website): consider a three-layer neural network with an input layer, a hidden layer, and an output layer. Say these layers have 2, 3 and 1 neurons respectively (although the exact numbers don't really matter).

Now an input is given: x1 and x2. Because the network is [2, 3, 1], the weights are randomly generated the first time, as a list containing a 2x3 and a 3x1 matrix. The biases are a list of a 3x1 and a 1x1 matrix.
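For reference, this is roughly how the book's code sets that up (note that it stores the weight matrices transposed relative to the 2x3 / 3x1 description, i.e. shapes (3, 2) and (1, 3), so that w·x + b works with column vectors; the bias shapes are the 3x1 and 1x1 ones):

import numpy as np

sizes = [2, 3, 1]  # input, hidden and output layer sizes

# one bias column vector per non-input layer: shapes (3, 1) and (1, 1)
biases = [np.random.randn(y, 1) for y in sizes[1:]]

# one weight matrix per pair of adjacent layers: shapes (3, 2) and (1, 3)
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]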

Now the part I don't get: the formula calculated in the hidden layer:

weights x input - biases = 0
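Just to pin down the notation: the book writes the weighted input as w·x + b and then applies the sigmoid, so the minus sign above is essentially the same thing with the bias treated as a threshold. In NumPy the hidden-layer computation would look something like this sketch (using the column-vector convention from the snippet above):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([[1.0], [2.0]])   # input column vector (x1, x2)
W1 = np.random.randn(3, 2)     # hidden-layer weights, one row per hidden neuron
b1 = np.random.randn(3, 1)     # hidden-layer biases

z1 = np.dot(W1, x) + b1        # weighted input of each hidden neuron
a1 = sigmoid(z1)               # hidden-layer activations, shape (3, 1)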

On every iteration the weights and biases are changed slightly, based on the derivative, in order to find an optimum. If this is the case, why don't the weights and biases of every neuron converge to the same values?
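To be concrete, by "changed slightly based on the derivative" I mean the plain gradient-descent update, something like this sketch (eta is the learning rate; grad_w and grad_b would come from backpropagation):

def gradient_descent_step(weights, biases, grad_w, grad_b, eta=0.1):
    # nudge every weight and bias a little in the direction that lowers the cost
    weights = [w - eta * gw for w, gw in zip(weights, grad_w)]
    biases = [b - eta * gb for b, gb in zip(biases, grad_b)]
    return weights, biases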

Upvotes: 0

Views: 486

Answers (1)

Yorian

Reputation: 2062

I think I found the answer by doing some tests as well as finding some information on the internet. The answer lies in having random initial weights and biases. If all "neurons" were equal, they would all come to the same result, since the weights, biases and inputs would be equal. Having random weights allows for different answers:

x1 = 1
x2 = 2
x3 = 3

w1 = [0, 0, 1], giving w dot x = 3
w2 = [3, 0, 0], giving w dot x = 3
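A quick numerical way to see the symmetry point (my own sketch, not from the book): if every weight and bias starts out identical, all hidden neurons compute the same activation and, by symmetry, receive the same gradient, so gradient descent keeps them identical forever. Random initialization breaks that tie:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([[1.0], [2.0], [3.0]])  # the x1, x2, x3 from the example above

# all three hidden neurons identical: same weight row, same bias
W_equal = np.ones((3, 3))
b_equal = np.zeros((3, 1))
print(sigmoid(np.dot(W_equal, x) + b_equal).ravel())  # three identical activations

# random initialization: every hidden neuron computes something different
rng = np.random.default_rng(0)
W_rand = rng.standard_normal((3, 3))
b_rand = rng.standard_normal((3, 1))
print(sigmoid(np.dot(W_rand, x) + b_rand).ravel())    # three different activations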

If anyone can confirm, please do so.

Upvotes: 0
