user12158459
user12158459

Reputation:

Lagrange Multiplier Method using NLsolve.jl

I would like to minimize a distance function ||dz - z|| under the constraint that g(z) = 0. I wanted to use Lagrange Multipliers to solve this problem. Then I used NLsolve.jl to solve the non-linear equation that I end up with.

using NLsolve
using ForwardDiff

function ProjLagrange(dz, g::Function)
    λ_init = ones(size(g(dz...),1))
    initial_x = vcat(dz, λ_init)

    function gradL!(F, x)
        len_dz = length(dz)
        z = x[1:len_dz]
        λ = x[len_dz+1:end]
        
        F = Array{Float64}(undef, length(x))

        my_distance(z) = norm(dz - z)
        ∇f = z -> ForwardDiff.gradient(my_distance, z)
        F[1:len_dz] = ∇f(z) .- dot(λ, g(z...))

        if length(λ) == 1
            F[end] = g(z...) 
        else
            F[len_dz+1:end] = g(z) 
        end
    end
    nlsolve(gradL!, initial_x)
end


g_test(x1, x2, x3) = x1^2 + x2 - x2 + 5
z = [1000,1,1]

ProjLagrange(z, g_test)

But I always end up with Zero: [NaN, NaN, NaN, NaN] and Convergence: false. Just so you know I have already solved the equation by using Optim.jl and minimizing the following function: Proj(z) = b * sum(abs.(g(z))) + a * norm(dz - z). But I would really like to know if this is possible with NLsolve. Any help is greatly appreciated!

Upvotes: 1

Views: 606

Answers (2)

Daniel Schmid
Daniel Schmid

Reputation: 36

I altered your code as below (see my comments in there) and got the following output. It doesn't throw NaNs anymore, reduces the objective and converges. Does this differ from your Optim.jl results?

Results of Nonlinear Solver Algorithm
 * Algorithm: Trust-region with dogleg and autoscaling
 * Starting Point: [1000.0, 1.0, 1.0, 1.0]
 * Zero: [9.80003, -49.5203, 51.5203, -0.050888]
 * Inf-norm of residuals: 0.000000
 * Iterations: 10
 * Convergence: true
   * |x - x'| < 0.0e+00: false
   * |f(x)| < 1.0e-08: true
 * Function Calls (f): 11
 * Jacobian Calls (df/dx): 11
using NLsolve
using ForwardDiff
using LinearAlgebra: norm, dot
using Plots

function ProjLagrange(dz, g::Function, n_it)
    λ_init = ones(size(g(dz),1))
    initial_x = vcat(dz, λ_init)

    # These definitions can go outside as well
    len_dz = length(dz)
    my_distance = z -> norm(dz - z)
    ∇f = z -> ForwardDiff.gradient(my_distance, z)
    # In fact, this is probably the most vital difference w.r.t. your proposal.
    # We need the gradient of the constraints.
    ∇g = z -> ForwardDiff.gradient(g, z)
    
    function gradL!(F, x)
        z = x[1:len_dz]
        λ = x[len_dz+1:end]

        # `F` is memory allocated by NLsolve to store the residual of the
        # respective call of `gradL!` and hence doesn't need to be allocated
        # anew every time (or at all).
        F[1:len_dz] = ∇f(z) .- λ .* ∇g(z)
        F[len_dz+1:end] .= g(z)
    end

    return nlsolve(gradL!, initial_x, iterations=n_it, store_trace=true)
end

# Presumable here is something wrong: x2 - x2 is not very likely, also made it
# callable directly with an array argument
g_test = x -> x[1]^2 + x[2] - x[3] + 5
z = [1000,1,1]
n_it = 10000

res = ProjLagrange(z, g_test, n_it)

# Ugly reformatting here
trace = hcat([[state.iteration; state.fnorm; state.stepnorm] for state in res.trace.states]...)
plot(trace[1,:], trace[2,:], label="f(x) inf-norm", xlabel="steps")

Evolution of inf-norm of f(x) over iteration steps

[Edit: Adapted solution to incorporate correct gradient computation for g()]

Upvotes: 0

Benoit Pasquier
Benoit Pasquier

Reputation: 3015

Starting almost from scratch and wikipedia's Lagrange multiplier page because it was good for me, the code below seemed to work. I added an λ₀s argument to the ProjLagrange function so that it can accept a vector of initial multiplier λ values (I saw you initialized them at 1.0 but I thought this was more generic). (Note this has not been optimized for performance!)

using NLsolve, ForwardDiff, LinearAlgebra
function ProjLagrange(x₀, λ₀s, gs, n_it)
    # distance function from x₀ and its gradients
    f(x) = norm(x - x₀)
    ∇f(x) = ForwardDiff.gradient(f, x)

    # gradients of the constraints
    ∇gs = [x -> ForwardDiff.gradient(g, x) for g in gs]

    # Form the auxiliary function and its gradients
    ℒ(x,λs) = f(x) - sum(λ * g(x) for (λ,g) in zip(λs,gs))
    ∂ℒ∂x(x,λs) = ∇f(x) - sum(λ * ∇g(x) for (λ,∇g) in zip(λs,∇gs))
    ∂ℒ∂λ(x,λs) = [g(x) for g in gs]
    
    # as a function of a single argument
    nx = length(x₀)
    ℒ(v) = ℒ(v[1:nx], v[nx+1:end])
    ∇ℒ(v) = vcat(∂ℒ∂x(v[1:nx], v[nx+1:end]), ∂ℒ∂λ(v[1:nx], v[nx+1:end]))

    # and solve
    v₀ = vcat(x₀, λ₀s)
    nlsolve(∇ℒ, v₀, iterations=n_it)
end

# test
gs_test = [x -> x[1]^2 + x[2] - x[3] + 5]
λ₀s_test = [1.0]
x₀_test = [1000.0, 1.0, 1.0]
n_it = 100

res = ProjLagrange(x₀_test, λ₀s_test, gs_test, n_it)

gives me

julia> res = ProjLagrange(x₀_test, λ₀s_test, gs_test, n_it)
Results of Nonlinear Solver Algorithm
 * Algorithm: Trust-region with dogleg and autoscaling
 * Starting Point: [1000.0, 1.0, 1.0, 1.0]
 * Zero: [9.800027199717013, -49.52026655749088, 51.520266557490885, -0.050887973682118504]
 * Inf-norm of residuals: 0.000000
 * Iterations: 10
 * Convergence: true
 * |x - x'| < 0.0e+00: false
 * |f(x)| < 1.0e-08: true
 * Function Calls (f): 11
 * Jacobian Calls (df/dx): 11

Upvotes: 1

Related Questions