bido
bido

Reputation: 47

Error in evaluating a function

EDIT: The code that I have pasted is too long. Basicaly I dont know how to work with the second code, If I know how calculate alpha from the second code I think my problem will be solved. I have tried a lot of input arguments for the second code but it does not work!

I have written following code to solve a convex optimization problem using Gradient descend method:

function [optimumX,optimumF,counter,gNorm,dx] = grad_descent()

x0 = [3 3]';%'//
terminationThreshold = 1e-6;
maxIterations = 100;
dxMin = 1e-6;

gNorm = inf; x = x0; counter = 0; dx = inf;

% ************************************
f = @(x1,x2) 4.*x1.^2 + 2.*x1.*x2 +8.*x2.^2 + 10.*x1 + x2;
%alpha = 0.01;
% ************************************

figure(1); clf; ezcontour(f,[-5 5 -5 5]); axis equal; hold on

f2 = @(x) f(x(1),x(2));

% gradient descent algorithm:
while and(gNorm >= terminationThreshold, and(counter <= maxIterations, dx >= dxMin))
    g = grad(x);
    gNorm = norm(g);

    alpha = linesearch_strongwolfe(f,-g, x0, 1);

    xNew = x - alpha * g;
    % check step
    if ~isfinite(xNew)
        display(['Number of iterations: ' num2str(counter)])
        error('x is inf or NaN')
    end
    % **************************************
    plot([x(1) xNew(1)],[x(2) xNew(2)],'ko-')
    refresh
    % **************************************

    counter = counter + 1;
    dx = norm(xNew-x);
    x = xNew;
end
optimumX = x;
optimumF = f2(optimumX);
counter = counter - 1;

% define the gradient of the objective
function g = grad(x)
g = [(8*x(1) + 2*x(2) +10)
    (2*x(1) + 16*x(2) + 1)];
end

end

As you can see, I have commented out the alpha = 0.01; part. I want to calculate alpha via an other code. Here is the code (This code is not mine)

function alphas = linesearch_strongwolfe(f,d,x0,alpham)

alpha0 = 0;
alphap = alpha0;
c1 = 1e-4;
c2 = 0.5;
alphax = alpham*rand(1);
[fx0,gx0] = feval(f,x0,d);
fxp = fx0;
gxp = gx0;
i=1;

while (1 ~= 2)
  xx = x0 + alphax*d;
  [fxx,gxx] = feval(f,xx,d);
  if (fxx > fx0 + c1*alphax*gx0) | ((i > 1) & (fxx >= fxp)),
    alphas = zoom(f,x0,d,alphap,alphax);
    return;
  end
  if abs(gxx) <= -c2*gx0,
    alphas = alphax;
    return;
  end
  if gxx >= 0,
    alphas = zoom(f,x0,d,alphax,alphap);
    return;
  end
  alphap = alphax;
  fxp = fxx;
  gxp = gxx;
  alphax = alphax + (alpham-alphax)*rand(1);
  i = i+1;
end

function alphas = zoom(f,x0,d,alphal,alphah)
c1 = 1e-4;
c2 = 0.5;
[fx0,gx0] = feval(f,x0,d);

while (1~=2),
   alphax = 1/2*(alphal+alphah);
   xx = x0 + alphax*d;
   [fxx,gxx] = feval(f,xx,d);
   xl = x0 + alphal*d;
   fxl = feval(f,xl,d);
   if ((fxx > fx0 + c1*alphax*gx0) | (fxx >= fxl)),
      alphah = alphax;
   else
      if abs(gxx) <= -c2*gx0,
        alphas = alphax;
        return;
      end
      if gxx*(alphah-alphal) >= 0,
        alphah = alphal;
      end
      alphal = alphax;
   end
end 

But I get this error:

Error in linesearch_strongwolfe (line 11) [fx0,gx0] = feval(f,x0,d);

As you can see I have written the f function and its gradient manually. linesearch_strongwolfe(f,d,x0,alpham) takes a function f, Gradient of f, a vector x0 and a constant alpham. is there anything wrong with my declaration of f? This code works just fine if I put back alpha = 0.01;

Upvotes: 2

Views: 165

Answers (1)

As I see it:

x0 = [3; 3]; %2-element column vector
g = grad(x0); %2-element column vector
f = @(x1,x2) 4.*x1.^2 + 2.*x1.*x2 +8.*x2.^2 + 10.*x1 + x2;
linesearch_strongwolfe(f,-g, x0, 1); %passing variables

inside the function:

[fx0,gx0] = feval(f,x0,-g); %variable names substituted with input vars

This will in effect call

[fx0,gx0] = f(x0,-g);

but f(x0,-g) is a single 2-element column vector with these inputs. Assingning the output to two variables will not work.

You either have to define f as a proper named function (just like grad) to output 2 variables (one for each component), or edit the code of linesearch_strongwolfe to return a single variable, then slice that into 2 separate variables yourself afterwards.


If you experience a very rare kind of laziness and don't want to define a named function, you can still use an anonymous function at the cost of duplicating code for the two components (at least I couldn't come up with a cleaner solution):

f = @(x1,x2) deal(4.*x1(1)^2 + 2.*x1(1)*x2(1) +8.*x2(1)^2 + 10.*x1(1) + x2(1),...
                  4.*x1(2)^2 + 2.*x1(2)*x2(2) +8.*x2(2)^2 + 10.*x1(2) + x2(2));

[fx0,gx0] = f(x0,-g); %now works fine

as long as you always have 2 output variables. Note that this is more like a proof of concept, since this is ugly, inefficient, and very susceptible to typos.

Upvotes: 2

Related Questions