Reputation: 653
I am trying to train a 3-input, 1-output neural network (with an input layer, one hidden layer and an output layer) in MATLAB to classify quadratics. I am attempting to implement three phases: feed-forward, $x_i^{out}=f(s_i)$ with $s_i=\sum_j w_{ij}x_j^{in}$; back-propagation, $\delta_j^{in}=f'(s_j)\sum_i \delta_i^{out}w_{ij}$; and weight updating, $w_{ij}^{new}=w_{ij}^{old}-\epsilon\,\delta_i^{out}x_j^{in}$, where $x$ is an input vector, $w$ is a weight and $\epsilon$ is the learning rate.
I am having trouble coding the hidden layer and adding the activation function $f(s)=\tanh(s)$: the error in the output of the network doesn't seem to decrease. Can someone point out what I am implementing wrong?
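For reference, the only property of the activation that back-propagation needs is its derivative. For $f(s)=\tanh(s)$ this is a standard identity, and it can be evaluated from the activation that was already computed in the forward pass:

$$f'(s) = \frac{d}{ds}\tanh(s) = 1 - \tanh^2(s) = 1 - f(s)^2$$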
The inputs are the real coefficients of the quadratic $ax^2 + bx + c = 0$, and the output should be positive if the quadratic has two real roots and negative if it doesn't.
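As a quick illustration of the target described above, here is a minimal sketch that labels two hand-picked quadratics by the sign of the discriminant $b^2-4ac$. The variable names `coeffs` and `disc` are made up for this example and do not appear in the post.

```matlab
% Columns are [a; b; c] for two example quadratics (values chosen for illustration):
%   x^2 - 3x + 2 = 0  has two real roots (x = 1 and x = 2)
%   x^2 + 1 = 0       has no real roots
coeffs = [1 -3 2; 1 0 1]';

% Discriminant b^2 - 4ac for each column
disc = coeffs(2,:).^2 - 4*coeffs(1,:).*coeffs(3,:);

% Teacher label: +1 for two real roots, -1 for none
T = sign(disc);
```

Note that `sign` returns 0 when the discriminant is exactly zero (a repeated root), a case the +1/-1 labelling does not cover; with random real coefficients this essentially never happens.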
nTrain = 100; % training set
nOutput = 1;
nSecondLayer = 7; % size of hidden layer (arbitrary)
trainExamples = rand(4,nTrain); % independent random set of examples
trainExamples(4,:) = ones(1,nTrain); % set the dummy input to be 1
T = sign(trainExamples(2,:).^2-4*trainExamples(1,:).*trainExamples(3,:)); % The teacher provides this for every example
%The student neuron starts with random weights
w1 = rand(4,nSecondLayer);
w2 = rand(nSecondLayer,nOutput);
nwrong = 1;
S1(nSecondLayer,nTrain) = 0;
S2(nOutput,nTrain) = 0;
while( nwrong>1e-2 ) % more than some small number close to zero
    for i=1:nTrain
        x = trainExamples(:,i);
        S2(:,i) = w2'*S1(:,i);
        deltak = tanh(S2(:,i)) - T(:,i); % back propagate
        deltaj = (1-tanh(S2(:,i)).^2).*(w2*deltak); % back propagate
        w2 = w2 - tanh(S1(:,i))*deltak'; % updating
        w1 = w1- x*deltaj'; % updating
    end
    output = tanh(w2'*tanh(w1'*trainExamples));
    dOutput = output-T;
    nwrong = sum(abs(dOutput));
    nepochs = nepochs+1
end
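When the error refuses to decrease, one common way to locate the mistake is to compare the back-propagated gradients against finite differences. The following is only a sketch under stated assumptions: it assumes a squared-error loss $E=\frac{1}{2}(y-t)^2$, which the post does not state, so the output delta here carries an extra factor $f'(s)$ compared with the `tanh(S2) - T` used in the code above. The names `forward`, `loss`, `g1`, `g2`, `n1`, `n2` and `maxErr` are hypothetical.

```matlab
% Gradient check for a 4-7-1 tanh network on a single example.
% ASSUMPTION: squared-error loss E = 0.5*(y - t)^2 (not stated in the post).
nInput = 4; nHidden = 7;
w1 = 0.1*randn(nInput, nHidden);   % input-to-hidden weights
w2 = 0.1*randn(nHidden + 1, 1);    % hidden-to-output weights, last row is the bias
x  = [2*rand(3,1) - 1; 1];         % coefficients a, b, c plus the constant input 1
t  = sign(x(2)^2 - 4*x(1)*x(3));   % teacher label from the discriminant

% Forward pass and loss as functions of the weights
forward = @(w1, w2) tanh(w2' * [tanh(w1' * x); 1]);
loss    = @(w1, w2) 0.5 * (forward(w1, w2) - t)^2;

% Analytic gradients from back-propagation
h = [tanh(w1' * x); 1];                     % hidden activations plus bias unit
y = tanh(w2' * h);                          % network output
dOut    = (y - t) * (1 - y^2);              % output delta, includes f'(s)
dHidden = (1 - h(1:nHidden).^2) .* (w2(1:nHidden) * dOut);
g2 = h * dOut';                             % dE/dw2
g1 = x * dHidden';                          % dE/dw1

% Numerical gradients by central differences
step = 1e-6;
n1 = zeros(size(w1));
for k = 1:numel(w1)
    e = zeros(size(w1)); e(k) = step;
    n1(k) = (loss(w1 + e, w2) - loss(w1 - e, w2)) / (2*step);
end
n2 = zeros(size(w2));
for k = 1:numel(w2)
    e = zeros(size(w2)); e(k) = step;
    n2(k) = (loss(w1, w2 + e) - loss(w1, w2 - e)) / (2*step);
end

% Largest disagreement between analytic and numerical gradients
maxErr = max([abs(g1(:) - n1(:)); abs(g2(:) - n2(:))]);
```

If `maxErr` is not tiny, the layer whose gradient disagrees is the one to inspect, for example a derivative evaluated at the wrong layer's pre-activation.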
Upvotes: 1
Views: 6250
Reputation: 653
After a few days of bashing my head against the wall I discovered a small typo. Below is a working solution:
% Set up parameters
nInput = 4; % number of nodes in input
nOutput = 1; % number of nodes in output
nHiddenLayer = 7; % number of nodes in the hidden layer
nTrain = 1000; % size of training set
epsilon = 0.01; % learning rate
% Set up the inputs: random coefficients between -1 and 1
trainExamples = 2*rand(nInput,nTrain)-1;
trainExamples(nInput,:) = ones(1,nTrain); %set the last input to be 1
% Set up the student neurons for both hidden and the output layers
S1(nHiddenLayer,nTrain) = 0;
S2(nOutput,nTrain) = 0;
% The student neuron starts with random weights from both input and the hidden layers
w1 = rand(nInput,nHiddenLayer);
w2 = rand(nHiddenLayer+1,nOutput);
% Calculate the teacher outputs according to the quadratic formula
T = sign(trainExamples(2,:).^2-4*trainExamples(1,:).*trainExamples(3,:));
% Initialise values for looping
nEpochs = 0;
nWrong = nTrain*0.01;
Wrong = [];
Epoch = [];
while(nWrong >= (nTrain*0.01)) % as long as more than 1% of outputs are wrong
    for i=1:nTrain
        x = trainExamples(:,i);
        S1(1:nHiddenLayer,i) = w1'*x;
        S2(:,i) = w2'*[tanh(S1(:,i));1];
        delta1 = tanh(S2(:,i)) - T(:,i); % back propagate
        delta2 = (1-tanh(S1(:,i)).^2).*(w2(1:nHiddenLayer,:)*delta1); % back propagate
        w1 = w1 - epsilon*x*delta2'; % update
        w2 = w2 - epsilon*[tanh(S1(:,i));1]*delta1'; % update
    end
    outputNN = sign(tanh(S2));
    delta = outputNN - T; % difference between student and teacher
    nWrong = sum(abs(delta/2));
    nEpochs = nEpochs + 1;
    Wrong = [Wrong nWrong];
    Epoch = [Epoch nEpochs];
end
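The loop above only measures the error on the training set. A minimal sketch of how one might check the network on fresh quadratics follows; it is not part of the original solution. The names `nTest`, `testExamples`, `Ttest`, `hidden`, `outputTest` and `errorRate` are made up here, and the random `w1`, `w2` are placeholders so the snippet runs on its own. In practice they would be the weights left over from training.

```matlab
% Evaluate a 4-7-1 tanh network on a fresh test set in one vectorized pass.
nInput = 4; nHiddenLayer = 7; nTest = 200;

% PLACEHOLDER weights so this runs standalone; use the trained w1, w2 instead
w1 = rand(nInput, nHiddenLayer);
w2 = rand(nHiddenLayer + 1, 1);

% Fresh random coefficients in [-1, 1], with the last input fixed to 1
testExamples = 2*rand(nInput, nTest) - 1;
testExamples(nInput, :) = 1;

% Teacher labels from the discriminant b^2 - 4ac
Ttest = sign(testExamples(2,:).^2 - 4*testExamples(1,:).*testExamples(3,:));

% Forward pass: hidden activations plus a bias row, then the output sign
hidden = [tanh(w1' * testExamples); ones(1, nTest)];
outputTest = sign(tanh(w2' * hidden));

% Fraction of test quadratics that are misclassified
errorRate = mean(outputTest ~= Ttest);
```

Because the whole test set goes through as one matrix, no per-example loop is needed for evaluation; the loop in training exists only because the weights change after every example.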
Upvotes: 2