Reputation: 55
I am trying to create a simple neural network and am stuck on updating the weights of the first layer in a two-layer network. I believe the update I am doing to w2 is correct, based on what I learned from the backpropagation algorithm. I am not including bias for now. What I am stuck on is how to update the first-layer weight w1.
import numpy as np
np.random.seed(10)
def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1.0 - x)

def cost_function(output, y):
    return (output - y) ** 2
x = 2
y = 4
w1 = np.random.rand()
w2 = np.random.rand()
# forward pass
h = sigmoid(w1 * x)
o = sigmoid(h * w2)
cost_function_output = cost_function(o, y)

# backward pass
prev_w2 = w2
w2 -= 0.5 * 2 * cost_function_output * h * sigmoid_derivative(o)  # 0.5 being learning rate
w1 -= 0 # What do you update this to?
print(cost_function_output)
Upvotes: 1
Views: 554
Reputation: 164
I'm not able to comment on your question, so I'm writing here.
Firstly, your sigmoid_derivative function is wrong.
The derivative of sigmoid(x*y) w.r.t. x is sigmoid(x*y) * (1 - sigmoid(x*y)) * y.
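As a quick sanity check (a minimal sketch; a and b stand in for the x and y in the formula above, renamed so they don't clash with the question's x and y, and their values are arbitrary), you can compare the analytic derivative against a finite-difference approximation:

import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

a, b = 2.0, 0.7  # arbitrary values, just for the check

# analytic derivative of sigmoid(a*b) w.r.t. a, including the factor b
analytic = sigmoid(a * b) * (1 - sigmoid(a * b)) * b

# central finite-difference approximation
eps = 1e-6
numeric = (sigmoid((a + eps) * b) - sigmoid((a - eps) * b)) / (2 * eps)

print(analytic, numeric)  # the two should agree to several decimal places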
We need dW1 and dW2 (these are the partial derivatives dJ/dW1 and dJ/dW2, respectively).
J = (o - y)^2
therefore dJ/do = 2*(o - y)
Now, for dW2:
dJ/dW2 = dJ/do * do/dW2 (chain rule)
dJ/dW2 = (2*(o - y)) * (o*(1 - o)*h)
dW2 equals the above expression, so
W2 -= learning_rate*dW2
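In code, using the question's variable names (a sketch, assuming the same forward pass; note that the gradient uses 2*(o - y), not the cost value itself as in the question's w2 update):

learning_rate = 0.5

# forward pass, as in the question
h = sigmoid(w1 * x)
o = sigmoid(h * w2)

# gradient of the cost w.r.t. w2, via the chain rule
dJ_do = 2 * (o - y)        # dJ/do
do_dW2 = o * (1 - o) * h   # do/dW2
dW2 = dJ_do * do_dW2       # dJ/dW2

w2 -= learning_rate * dW2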
Now, for dW1:
dJ/dh = dJ/do * do/dh = (2*(o - y)) * (o*(1 - o)*W2)
dJ/dW1 = dJ/dh * dh/dW1 = ((2*(o - y)) * (o*(1 - o)*W2)) * (h*(1 - h)*x)
dW1 equals the above expression, so
W1 -= learning_rate*dW1
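Putting it together for w1 (again a sketch with the question's variable names; both gradients are computed from the same forward pass before either weight is updated):

learning_rate = 0.5

# forward pass
h = sigmoid(w1 * x)
o = sigmoid(h * w2)

# shared piece of both gradients
dJ_do = 2 * (o - y)

# gradient for w2
dW2 = dJ_do * o * (1 - o) * h

# gradient for w1, propagating back through h
dJ_dh = dJ_do * o * (1 - o) * w2   # dJ/dh
dW1 = dJ_dh * h * (1 - h) * x      # dJ/dW1

# update both weights only after both gradients are computed
w1 -= learning_rate * dW1
w2 -= learning_rate * dW2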
PS: Try drawing a computational graph; it makes finding the derivatives a lot easier. (If you don't know about computational graphs, read up on them online.)
Upvotes: 3