patokun

Reputation: 113

MLP Neural network not training correctly, probably converging to a local minimum

I'm making an MLP neural network with back-propagation in MATLAB. The problem is that it doesn't seem able to handle the curves in a function well, and it also doesn't scale well with the values. For example, it can reach about 80% of cos(x), but if I use 100*cos(x) it will just not train at all.

What is even weirder is that it can train well on some functions, while on others it just doesn't work at all. For example, well trained: http://img515.imageshack.us/img515/2148/coscox3.jpg

Not so well: http://img252.imageshack.us/img252/5370/cos2d.jpg (the smoothness comes from having been left running for a long time)

Wrong results, stuck like this: http://img717.imageshack.us/img717/2145/ex2ug.jpg

This is the algorithm I'm trying to implement:

http://img594.imageshack.us/img594/9590/13012012001.jpg

http://img27.imageshack.us/img27/954/13012012002.jpg

And this is my implementation:

close all;clc;

j=[4,3,1];        % number of neurons in each hidden layer and in the output layer
i=[1,j(1),j(2)];  % number of inputs feeding each layer

X=0:0.1:pi;
d=cos(X);

%-----------Weights------------%
%-----First layer weights------%
W1p=rand([i(1)+1,j(1)]);
W1p=W1p/sum(W1p(:));
W1=rand([i(1)+1,j(1)]);
W1=W1/sum(W1(:));

%-----Second layer weights------%
W2p=rand([i(2)+1,j(2)]);
W2p=W2p/sum(W2p(:));
W2=rand([i(2)+1,j(2)]);
W2=W2/sum(W2(:));

%-----Third layer weights------%
W3p=rand([i(3)+1,j(3)]);
W3p=W3p/sum(W3p(:));
W3=rand([i(3)+1,j(3)]);
W3=W3/sum(W3(:));
%-----------/Weights-----------%

V1=zeros(1,j(1));
V2=zeros(1,j(2));
V3=zeros(1,j(3));

Y1a=zeros(1,j(1));
Y1=[0 Y1a];
Y2a=zeros(1,j(2));
Y2=[0 Y2a];

O=zeros(1,j(3));
e=zeros(1,j(3));

%----Learning and forgetting factor-----%
alpha=0.1; % momentum ("forgetting") factor, weighs the previous weight change
etha=0.1;  % learning rate
sortie=zeros(1,length(X));
while(1)

    n=randi(length(X),1);
    %---------------Feed forward---------------%

    %-----First layer-----%
    X0=[-1 X(:,n)];
    V1=X0*W1;
    Y1a=tanh(V1/2);

    %----Second layer-----%
    Y1=[-1 Y1a];
    V2=Y1*W2;
    Y2a=tanh(V2/2);

    %----Output layer-----%
    Y2=[-1 Y2a];
    V3=Y2*W3;
    O=tanh(V3/2);
    e=d(n)-O;
    sortie(n)=O;

    %------------/Feed forward-----------------%

    %------------Backward propagation----------%

    %----Output layer-----%
    delta3=e*0.5*(1+O)*(1-O);
    W3n=W3 + alpha*(W3-W3p) + etha * delta3 * W3;

    %----Second layer-----%
    delta2=zeros(1,length(Y2a));
    for b=1:length(Y2a)
        delta2(b)=0.5*(1-Y2a(b))*(1+Y2a(b)) * sum(delta3*W3(b+1,1));
    end

    W2n=W2 + alpha*(W2-W2p) + (etha * delta2'*Y1)';

    %----First layer-----%
    delta1=zeros(1,length(Y1a));
    for b=1:length(Y1a)
        for m=1:length(Y2a)
            delta1(b)=0.5*(1-Y1a(b))*(1+Y1a(b)) * sum(delta2(m)*W2(b+1,m));
        end
    end

    W1n=W1 + alpha*(W1-W1p) + (etha * delta1'*X0)';

    W3p=W3;
    W3=W3n;

    W2p=W2;
    W2=W2n;

    W1p=W1;
    W1=W1n;

    figure(1);
    plot(1:length(d),d,1:length(d),sortie);
    drawnow;
end

My question is: what can I do to correct this? My guesses so far are that I either have something wrong in the back-propagation, specifically in calculating the deltas and the weight updates, or that I have the weights initialized wrong (too small, or not dependent on the initial input).
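For comparison, my W2 and W1 updates both follow the pattern "weight change = etha * delta' * (layer input)". Written the same way for the output layer, the update would look like the sketch below; I'm not sure whether that is what the algorithm in the scans actually prescribes:

% Sketch: output-layer update in the same delta-rule form as the W2/W1
% updates, using the layer input Y2 rather than W3 itself (unverified).
W3n = W3 + alpha*(W3-W3p) + (etha * delta3'*Y2)';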

Upvotes: 4

Views: 1891

Answers (1)

Fuzz

Reputation: 1805

I am not an expert in this field, but I have had some experience playing with MATLAB- and Java-based neural network systems.

I can suggest that using MATLAB's Neural Network Toolbox could help you; it has helped others that I know.
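For example, a minimal function-fitting run with the toolbox might look like this (just a sketch; I'm assuming the fitnet/train workflow is available in your MATLAB version):

% Minimal toolbox sketch (assumes the Neural Network Toolbox is installed).
x = 0:0.1:pi;
t = 100*cos(x);
net = fitnet([4 3]);      % two hidden layers, matching the 4-3-1 layout above
net = train(net, x, t);   % inputs/targets are normalized internally by default
y = net(x);
plot(x, t, x, y);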

I can offer a few points of information:

  • Do not expect NNs to work on all training data; sometimes the data is simply too complicated to be learned in this manner

  • The format of your NN (the number of layers and neurons) will have a drastic impact on its convergence performance

Finally:

  • Training algorithms like this will often train better when the various parameters are normalized to +/- 1. cos(x) is normalized; 100*cos(x) is not. This is because the required weight updates are much larger, while the training system may only be taking very small steps. If your data has multiple different ranges, then normalization is vital. Might I suggest you start with, at the very least, investigating that; see the sketch after this list
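As a minimal sketch of that idea applied to your script, you could scale the targets before training and undo the scaling afterwards (d_raw, d_max and sortie_rescaled are illustrative names, not from your code):

% Sketch: normalize targets to [-1, 1] before training, then undo the
% scaling on the network output.
d_raw = 100*cos(X);
d_max = max(abs(d_raw));
d     = d_raw / d_max;            % targets now fit the tanh output range
% ... train against the scaled targets d exactly as before ...
sortie_rescaled = sortie * d_max; % map predictions back to the original range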

Upvotes: 3
