Reputation: 87
I am building a 2-layer dense neural network from scratch. I am confused about how to update the weights between the input layer and the first hidden layer.
z1 = w1 @ inp + b1 # inp is the input vector
z1_act = activation(z1)
z2 = w2 @ z1_act + b2
z2_act = activation(z2)
gradient2 = 0.5 * ((out - z2_act) ** 2) * activation_deriv(z2) # out is the vector containing actual output
gradient1 = (w2.T @ gradient2) * activation_deriv(z1)
delta_w2 = learning_rate * gradient2 * z1_act
delta_w1 = learning_rate * gradient1
w2 = w2 + delta_w2
w1 = w1 + delta_w1
The code runs and the shapes are correct, but I am not sure whether this is the right way to calculate delta_w1. Can anyone help me?
Edit: The structure of the neural network is: input layer → hidden layer → output layer.
Upvotes: 0
Views: 345
Reputation: 481
Almost: your way of calculating delta_w1 is in principle right. However, you want to move towards the minimum, so you are missing a negative sign in your formulas for delta_w1 and delta_w2. With the current implementation you would not optimize the weights but instead step in the 'wrong direction' (gradient ascent rather than gradient descent).
You might want to have a look at the following link as well: https://stats.stackexchange.com/questions/5363/backpropagation-algorithm
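For intuition on why the sign matters, here is a tiny self-contained Python sketch (a one-parameter toy problem, not your network) showing that the update has to subtract the gradient in order to move towards the minimum:

# Gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3.
def grad(w):
    return 2.0 * (w - 3.0)  # derivative of f

w = 0.0
learning_rate = 0.1
for _ in range(100):
    w = w - learning_rate * grad(w)  # minus sign: step against the gradient, towards the minimum
print(w)  # converges to roughly 3.0; with "w = w + ..." it would run away from the minimum instead

# Applied to your code, the update lines become:
# w2 = w2 - delta_w2
# w1 = w1 - delta_w1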
Upvotes: 1