Reputation: 3477
I was trying to program the perceptron learning rule for the AND example, where each input vector includes a bias term x0 = 1. The rule for updating the weights is
theta[j] = theta[j] + n * (target - output) * x[j]
and I have written the following program in Python:
import math

def main():
    theta=[-0.8,0.5,0.5]
    learnrate=0.1
    target=[0,0,0,1]
    output=[0,0,0,0]
    x=[[1,0,0],[1,0,1],[1,1,0],[1,1,1]]
    for i in range(0,len(x)):
        output[i]=evaluate(theta,x[i])
    for j in range(0,100):
        update(theta,x,learnrate,target,output)

def evaluate(theta,x):
    r=theta[0]*x[0]+theta[1]*x[1]+theta[2]*x[2]
    r=1/(1+math.exp(-r))
    return r

def update(theta,x,n,target,output):
    for i in range(0,len(x)):
        for j in range(0,len(x[i])):
            delta=n*(target[i]-output[i])*x[i][j]
            theta[j]=theta[j]+delta
        print theta
        r=evaluate(theta,x[i])
        print r
    print "\n"

if __name__=="__main__":
    main()
The problem occurs when I run the program with the first set of theta values:
theta=[-0.8,0.5,0.5]
I got the values:
[-7.869649929246505, 0.7436243430418894, 0.7436243430418894]
0.000382022127989
[-7.912205677565339, 0.7436243430418894, 0.7010685947230553]
0.000737772440166
[-7.954761425884173, 0.7010685947230553, 0.7010685947230553]
0.000707056388635
[-7.90974482561542, 0.7460851949918075, 0.7460851949918075]
0.00162995036457
The bracketed terms are the updated theta values, and the other values are the results of the evaluation. My results should be very close to 1 for the last case and close to 0 for the others, but this is not happening.
When I use these values:
theta=[-30,20,20]
the outputs neatly approach 1 for the last data set and 0 for the others:
[-30.00044943890137, 20.0, 20.0]
9.35341823401e-14
[-30.000453978688242, 20.0, 19.99999546021313]
4.53770586567e-05
[-30.000458518475114, 19.99999546021313, 19.99999546021313]
4.53768526644e-05
[-30.000453978688242, 20.0, 20.0]
0.999954581518
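(These outputs are just the logistic function of the net input: the four cases give net inputs of roughly -30, -10, -10 and 10, and 1/(1+exp(30)) is about 9.4e-14, 1/(1+exp(10)) about 4.5e-5, and 1/(1+exp(-10)) about 0.99995, so this weight set already separates the AND cases even before any training.)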
and even when I try with another set:
theta=[-5,20,20]
my results are not as good as the previous ones:
[-24.86692245237865, 10.100003028432075, 10.100003028432075]
1.5864734081e-11
[-24.966922421788425, 10.100003028432075, 10.000003059022298]
3.16190904073e-07
[-25.0669223911982, 10.000003059022298, 10.000003059022298]
2.86101378609e-07
[-25.0669223911982, 10.000003059022298, 10.000003059022298]
0.00626235903
Am I missing something, or is there something wrong in this implementation? I know that there is another algorithm that uses derivatives, but I would like to implement this naive case.
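(For context, the derivative-based version would, as I understand it, only change the delta term to include the sigmoid derivative out*(1-out). A rough, untested sketch, with a hypothetical update_with_derivative function:)

def update_with_derivative(theta, x, n, target):
    # gradient-descent / delta-rule variant: the update includes the
    # derivative of the logistic function, out*(1-out)
    for i in range(0, len(x)):
        out = evaluate(theta, x[i])
        for j in range(0, len(x[i])):
            delta = n*(target[i]-out)*out*(1-out)*x[i][j]
            theta[j] = theta[j]+delta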
Thanks
Upvotes: 2
Views: 2049
Reputation: 33509
The problem is that you are not recomputing the output after the weights change, so the error signal stays constant and the weights change in the same way on every iteration.
Change the code as follows:
def update(theta,x,n,target,output):
    for i in range(0,len(x)):
        output[i] = evaluate(theta,x[i]) # This line is added
        for j in range(0,len(x[i])):
            delta=n*(target[i]-output[i])*x[i][j]
            theta[j]=theta[j]+delta
        print theta
        r=evaluate(theta,x[i])
        print r
    print "\n"
and you should find it converges much better.
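For reference, here is a minimal self-contained sketch of the whole corrected program. It keeps your logic but recomputes the output from the current weights on every pass, drops the shared output list (it is no longer needed), and reduces the printing to a final summary:

import math

def evaluate(theta, x):
    # weighted sum of the inputs passed through the logistic function
    r = theta[0]*x[0] + theta[1]*x[1] + theta[2]*x[2]
    return 1/(1+math.exp(-r))

def update(theta, x, n, target):
    # one pass over the training set; the output is recomputed
    # from the current weights before every weight change
    for i in range(len(x)):
        out = evaluate(theta, x[i])
        for j in range(len(x[i])):
            theta[j] = theta[j] + n*(target[i]-out)*x[i][j]

def main():
    theta = [-0.8, 0.5, 0.5]
    learnrate = 0.1
    target = [0, 0, 0, 1]
    x = [[1,0,0],[1,0,1],[1,1,0],[1,1,1]]
    for _ in range(1000):
        update(theta, x, learnrate, target)
    print(theta)
    print([evaluate(theta, xi) for xi in x])

if __name__ == "__main__":
    main()

With enough passes (you may need more than 100), the four outputs should drift towards 0, 0, 0, 1 even from the [-0.8, 0.5, 0.5] starting point.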
Upvotes: 1