Minimazing quadratic cost function: check validity of algorithm and how to optimize it

Question

I would like to estimate parameters of a PSF and the linear coefficients, given output data "y". The relation is given by :

I have the following function

I have to estimate all the 6 parameters (a,b,r0,c0,alpha,beta) given only "y" output data.

For this, I have coded the following Cost function to minimize :

function [J, Model] = Crit_J(p,D)

% Compute the model corresponding to parameters p
[R,C] = size(D);
[Cols,Rows] = meshgrid(1:C,1:R);
% Model
Model = p(1)*(1+((Rows-p(3)).^2+(Cols-p(4)).^2)/p(5)^2).^(-p(6)) + ...                   
        p(2)*ones(R,C); 
model = Model(:);                                                                        
% Introduce H matrix                                                                     
d = D(:);
H = [ model, ones(length(model),1)];                                                     
% Compute the cost function : taking absolute value                                      
J = abs((d-H*[p(1),p(2)]')'*(d-H*[p(1),p(2)]'));                                         

end

with vector p :

% Parameters theta and nu: initial values
a = 10;
b = 3;
r0 = 0;
c0 = 0;
alpha = 1;
beta = 1;
% Array of parameters
p = [a,b,r0,c0,alpha,beta]';

I must precise that matrix H is equal to (with R=C=20) :

In main program, I am using the Crit_J function this way :

    % Initial value pour J cost function
    Jmin = Inf;
    % Initial value for p vector
    init = 10;
% Minimization
%for i=1:1000000
kmax = 1e6
epsilon = 100;
k = 0;
while (k < kmax && Jmin > epsilon)
   p = init*randn(6,1);
   J = Crit_J(p,D);
   if (J < Jmin)
     Jmin = J; 
     pCurrent = p;
   end
   k=k+1;
end

Unfortunately, the estimations are not estimated well, even with a high number of iterations.

I don't know what initial values that I have to take for parameters (before the loop).

If I use for iteration : while (Jmin > epsilon), the program never returns since the epsilon value may be too small. I prefer use the classical for i=1:1000000.

That's why I would like to check if the minimization algorithm is right.

How can I confirm that the function Crit_J is correctly defined and used in main program ?

Does it exist Matlab function to achieve this mimimization ?

Update 1

I follow the answer given by @inkert and use Crit_J2 function like this :

function cost = Crit_J2(p,D,alpha,beta,r0,c0)

% Compute the model corresponding to parameters p
[R,C] = size(D);
[Cols,Rows] = meshgrid(1:C,1:R);
% Model
Model = p(1)*(1+((Rows-r0).^2+(Cols-c0).^2)/alpha^2).^(-beta) + ... 
        p(2)*ones(R,C);
%Model = p(1)*D + p(2)*ones(R,C); 
model = Model(:);
d = D(:);
%disp(model);
% Introduce H matrix 
H = [ model, ones(length(model),1)];
% Compute the cost function : taking absolute value
J2 = abs((d-H*[p(1),p(2)]')'*(d-H*[p(1),p(2)]'));
cost = J2;

end

Unfortunately, even I take initial parameters close from which have been fixed at the beginning of code, the estimations are bad.

I wonder if my quadratic cost function is valid; indeed, I have used the following expression :

J = abs((d-H*[p(1),p(2)]')'*(d-H*[p(1),p(2)]'));

d (= D(:))is the output data vector, H is the matrix allowing to write under matricial form :

d = H.p + epsilon

Do I have to add noise in the definition of J cost function ?

You can get the output file 2D on : example output.dat

and I get :

p_estimate = 

    1.1610
    2.3271
   13.2684
   13.6551
    3.3586
    0.4541

and expected : 14 
               5
               8.56
               10.995
               3
               2.45

Here the image for illustrating :

As you can see, there is an issue somewhere in my code.

Update 2

Bad estimations are also produced with Matlab's function "ga". I used it in this way :

% Linear inequalities : A*x < b                                                                      
A1 = [0, 0, -1, 0, 0, 0];                                                                
b1 = 0;                                                                                  
A2 = [0, 0, 0, -1, 0, 0];                                                                
b2 = 0;                                                                                  
A3 = [0, 0, 0, 0, -1, 0];                                                                
b3 = 0;                                                                                  
A4 = [0, 0, 0, 0, 0, -1];                                                                
b4 = -1;                                                                                 
%% Matrix A and b                                                                        
A = [A1; A2; A3; A4];                                                                    
b = [b1; b2; b3; b4];                                                                    
%                                                                                        
% 2 lines below : attempt to find minimum                                                
[pars, Jmin] = ga(@(x)Crit_J(x,D),6,A,b,[],[],[],[],[]);                                 

% Function  of cost                                                                      
function cost = Crit_J(p,D)

% Compute the model corresponding to parameters p                                        
[R,C] = size(D);
[Cols,Rows] = meshgrid(1:C,1:R);                                                         
% Model
Model = p(1)*(1+((Rows-p(3)).^2+(Cols-p(4)).^2)/p(5)^2).^(-p(6)) + ...                   
        p(2)*ones(R,C);                                                                  
model = Model(:);                                                                        
% Introduce H matrix 
d = D(:);
H = [ model, ones(length(model),1)];                                                     
% Compute the cost function : taking absolute value                                      
J = abs((d-H*[p(1),p(2)]')'*(d-H*[p(1),p(2)]'));                                            
cost = J;
end

Inequalities are the following ones :

p = [a,b,r0,c0,alpha,beta];

a=p(1) > 0
b=p(2) > 0
r0=p(3) > 0
c0=p(4) > 0
alpha=p(5) > 0
beta=p(6) > 1

Anyone could see an error with these inequalities or with the using of "ga" ?

Update 3

Here my last results about minimazing the cost function : good news and bad news.

1) Good news : right now, the 3 methods (fminsearch, fminunc and ga) gives the same estimations about 2 randoms parameters (a and b into y = a*PSF+b).

2) Bad news : the estimations on a and b are too small.

Now, I put the little code for the 3 methods (i.e the 3 functions cited above) :

% Amplitude value = a
amplitude = 14;

% Sky background = b
backgvalue = 5;
backg = backgvalue*ones(L,C);

% White Noise sigma
variance = 1;
WhiteNoise = sqrt(variance)*randn(L,C);

% Generate grid
[Cols,Rows] = meshgrid(1:C,1:L);
% Pack parameters
nu = [r0; c0; alpha; beta];
% Create normalizded PSF with amplitude = 1 and without background and %gaussian white noise
PSF = Moffat(nu,Rows,Cols);
% Normalized PSF and general case
Dnorm = PSF;
D = amplitude*PSF + backg + WhiteNoise;
    a = 3;
    b = 1;
    p0 = [a,b];
    
    %%%%%%%%%%%% fminsearch %%%%%%%%%%%%%%%%%%
    pars = fminsearch(@(x)Crit_J2(x,D,alpha,beta,r0,c0),p0);
    disp('pars : fminsearch');
    pars
    
    %%%%%%%%%%%% fminunc %%%%%%%%%%%%%%%%%%
    [pars, Jmin] = fminunc(@(x)Crit_J2(x,D,alpha,beta,r0,c0),p0);
    disp('pars : fminunc');
    pars
    
    %%%%%%%%%%%% ga %%%%%%%%%%%%%%%%%%
    %%% impose a > 0 and b > 0
    A1lin = [-1, 0];
    b1lin = 0;
    A2lin = [0, -1];
    b2lin = 0;
    
    %%% Matrix A and b
    Alin = [A1lin; A2lin];
    blin = [b1lin; b2lin];
    [pars, Jmin] = ga(@(x)Crit_J2(x,D,alpha,beta,r0,c0),2,Alin,blin);
    disp('pars : ga');
    pars

The estimation is about : a = 3.74 and b = 1.04 (instead of starting fixed values a = 14 and b = 5)

Should I conclude the issue comes from the definition of cost function used in my code ?, i.e :

% Function  of cost
function cost = Crit_J2(p,D,alpha,beta,r0,c0)

% Compute the model corresponding to parameters p
[R,C] = size(D);
[Cols,Rows] = meshgrid(1:C,1:R);
% Model
Model = p(1)*(1+((Rows-r0).^2+(Cols-c0).^2)/alpha^2).^(-beta) + ... 
        p(2)*ones(R,C);
%Model = p(1)*D + p(2)*ones(R,C); 
model = Model(:);
d = D(:);
%disp(model);
% Introduce H matrix 
H = [ model, ones(length(model),1)];
% Compute the cost function : taking absolute value
J2 = abs((d-H*[p(1),p(2)]')'*(d-H*[p(1),p(2)]'));
cost = J2;

end

I am using the following cost function :

J = abs((d-H*[p(1),p(2)]')'*(d-H*[p(1),p(2)]'));

with d the vector of initial data generated at the beginning of estimation, and H the matrix such that d = H.theta with theta=[a,b]=[p(1),p(2)]'

What is the explanation of this bad estimation (under estimation) of [a,b] (amplitude of star + background) ?

rinkert · Accepted Answer

Right now you are just randomly changing your parameters. If you change to an optimization procedure like implemented in fminsearch, this algorithm will find the minimum for you.

% initial parameters
a = 10;
b = 3;
r0 = 0;
c0 = 0;
alpha = 1;
beta = 1;
% Array of parameters
p0 = [a,b,r0,c0,alpha,beta]';


% Minimization of cost function for p 
[pars, Jmin] = fminsearch(@(x)Crit_J(x,D), p0);

You have to change your Crit_J function to only output the 'cost' to minimize for this to work. You can then later obtain your model by running the original function.

Update:

Perhaps you are finding a local minimum, you can try fminunc instead of fminsearch, or switch to a generic algorithm, but that might be overkill for this problem.

Minimazing quadratic cost function: check validity of algorithm and how to optimize it

Update 1

Update 2

Update 3

Answers (1)

Related Questions