Reputation: 23
I tried implementing a simple neural network using both the sigmoid and ReLU activation functions. With the sigmoid function I got some good outputs, but when using ReLU I got an array of only 0s and 1s. (I need the ReLU function because I want to use the code for outputs greater than 1.)
import numpy as np

def relu(x):
    return np.maximum(0, x)

def reluDerivative(x):
    x[x <= 0] = 0
    x[x > 0] = 1
    return x

training_inputs = np.array([[9, 0, 1],
                            [7, 1, 1],
                            [8, 0, 1],
                            [5, 1, 1]])

training_outputs = np.array([[9, 7, 8, 5]]).T

np.random.seed(1)
synaptic_weights = 2 * np.random.random((3, 1)) - 1

for iteration in range(100000):
    outputs = relu(np.dot(training_inputs, synaptic_weights))
    error = training_outputs - outputs
    adjustments = error * reluDerivative(outputs)
    synaptic_weights += np.dot(training_inputs.T, adjustments)

print("output after training: \n", outputs)
Upvotes: 0
Views: 1477
Reputation: 63
Update:
(Thanks for including the relu and reluDerivative methods)
The error is indeed in the reluDerivative(x) method.
When you do x[x <= 0] = 0, you are modifying the given NumPy array in place. The argument x is not a clone / deep copy of outputs; it is exactly the same NumPy array. So when you modify x, you also modify outputs.
I hope you can figure out why this causes the bug - but let me know if you would like a further explanation.
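For illustration only, here is a minimal sketch (not necessarily the answerer's intended fix) of a derivative function that builds a fresh array instead of writing into the array passed in, so the caller's outputs array is left untouched:

    def reluDerivative(x):
        # Build a new 0/1 array rather than overwriting x in place,
        # so the array the caller passed in is not modified.
        return (x > 0).astype(float)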
Update 2
It looks like the code has more issues than the one above, and these are a bit trickier:
If you step through the code with a debugger, you'll notice that, unfortunately, with the current random seed (1) the synaptic weights are initialized such that every training example produces a negative dot product, which the ReLU then clamps to zero. The gradient at those zeroed outputs is also zero, so the weights never get updated; this is one of the risks of using ReLU. How to mitigate this?
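As a rough sketch of one possible mitigation (only one of several, and the names, alpha value, and the alternative initialization below are my own assumptions, not part of the original post), you could use a "leaky" ReLU so the gradient is never exactly zero, or initialize the weights to small positive values so the initial dot products are not all negative:

    import numpy as np

    def leaky_relu(x, alpha=0.01):
        # Let a small fraction of negative inputs through instead of zeroing them.
        return np.where(x > 0, x, alpha * x)

    def leaky_relu_derivative(x, alpha=0.01):
        # Gradient is alpha (not 0) for negative inputs, so units cannot "die".
        return np.where(x > 0, 1.0, alpha)

    # Alternative: keep plain ReLU but start from small positive weights so the
    # initial dot products are not all negative, e.g.:
    # synaptic_weights = 0.1 * np.random.random((3, 1))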
Once you solve the problems above, you will still notice another issue: the errors and gradients blow up within a few iterations. This is because you're not yet using a "learning rate" parameter to constrain how much the weights are updated at each step. Read up on how to use a learning rate (or alpha) parameter.
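For example (a sketch only; the value 0.001 is an arbitrary placeholder that needs tuning, not something from the original post), the weight update inside the training loop could scale the adjustment by a small learning rate:

    learning_rate = 0.001  # small step size; tune for your data

    # inside the training loop:
    adjustments = error * reluDerivative(outputs)
    synaptic_weights += learning_rate * np.dot(training_inputs.T, adjustments)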
Good luck!
Upvotes: 1