Reputation: 10158
I wrote a simple neural network to learn an AND gate. I'm trying to understand why my cost never decreases and the predictions are always 0.5:
import numpy as np
import theano
import theano.tensor as T
inputs = [[0,0], [1,1], [0,1], [1,0]]
outputs = [[0], [1], [0], [0]]
x = theano.shared(value=np.asarray(inputs), name='x')
y = theano.shared(value=np.asarray(outputs), name='y')
alpha = 0.1
w_array = np.asarray(np.random.uniform(low=-1, high=1, size=(2, 1)), dtype=theano.config.floatX)
w = theano.shared(value=w_array, name='w', borrow=True)
output = T.nnet.sigmoid(T.dot(x, w))
cost = T.sum((y - output) ** 2)
updates = [(w, w - alpha * T.grad(cost, w))]
train = theano.function(inputs=[], outputs=[], updates=updates)
test = theano.function(inputs=[], outputs=[output])
calc_cost = theano.function(inputs=[], outputs=[cost])
for i in range(60000):
    if (i + 1) % 10000 == 0:
        print(i + 1)
        print(calc_cost())
    train()
print(test())
The output is always the same:
10000
[array(1.0)]
20000
[array(1.0)]
30000
[array(1.0)]
40000
[array(1.0)]
50000
[array(1.0)]
60000
[array(1.0)]
[array([[ 0.5],
[ 0.5],
[ 0.5],
[ 0.5]])]
It always predicts 0.5 regardless of the input, because the cost never deviates from 1 during training.
If I switch the outputs to [[0], [1], [1], [1]]
to learn an OR gate instead, I get the correct predictions and a correctly decreasing cost.
Upvotes: 0
Views: 149
Reputation: 66815
Your model is of the form
<w, x>
so it cannot build any separation that does not cross the origin. Such an equation can only express lines passing through the point (0, 0), and the line separating the AND gate ((1, 1) from everything else) obviously does not pass through the origin. This is also why your cost gets stuck at exactly 1: gradient descent drives w to 0, where sigmoid(0) = 0.5 for every input, so the cost settles at 4 * 0.25 = 1. You have to add a bias term, so your model becomes
<w, x> + b
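A minimal sketch of that fix, reusing the names from your code; the scalar bias b and its update are the only additions (plus explicit floatX casts for the shared data):

import numpy as np
import theano
import theano.tensor as T

inputs = [[0, 0], [1, 1], [0, 1], [1, 0]]
outputs = [[0], [1], [0], [0]]

x = theano.shared(value=np.asarray(inputs, dtype=theano.config.floatX), name='x')
y = theano.shared(value=np.asarray(outputs, dtype=theano.config.floatX), name='y')

alpha = 0.1
w_array = np.asarray(np.random.uniform(low=-1, high=1, size=(2, 1)),
                     dtype=theano.config.floatX)
w = theano.shared(value=w_array, name='w', borrow=True)
# New: a scalar bias, initialized to 0 (with one output unit a scalar suffices,
# and a scalar broadcasts cleanly over the (4, 1) result of T.dot(x, w)).
b = theano.shared(value=np.asarray(0.0, dtype=theano.config.floatX), name='b')

output = T.nnet.sigmoid(T.dot(x, w) + b)  # <w, x> + b
cost = T.sum((y - output) ** 2)

# Both parameters get their own gradient update.
updates = [(w, w - alpha * T.grad(cost, w)),
           (b, b - alpha * T.grad(cost, b))]
train = theano.function(inputs=[], outputs=[], updates=updates)

With the bias in place, your unchanged training loop should drive the cost toward 0 and the four predictions toward [0, 1, 0, 0].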
Upvotes: 1