Reputation: 71

(Openmdao 2.4.0) difference between providing no derivatives / forcing FD on a disciplines with derivatives

this question is in line with this one but it is not the same. The objective is still for students purpose ! Still playing with Sellar problem , I compared the 2 different problems :

problem 1 : MDA of Sellar without derivatives information on Disciplines with Newton solver as NonlinearSolver
problem 2 : MDA of Sellar with derivatives information on Disciplines with Newton solver as NonlinearSolver but with the options declare_partials('', '', method='fd') on each discipline in the problem level
for both, the linearsolver is the same and both compute the MDA for a similar point

I was expecting to obtain a similar results in terms of calls to disciplines (as, in my expectations, if no analytical derivatives are given and Newton is used, somewhere, FD have to be used to feed the Newton solver, and in that case, forcing FD when derivatives are given should lead to a similar solution ). But problem 1 has the following solution : Number of discipline2 calls: 9 and probleme 2 has the following solution: Number of disciplines calls: 13

Therefore both problems are not equivalent from OpenMDAO perspective. It should come from the way to solve a group with coupling with Newton when no analytical derivatives are provided but I would like to understand how it works .

Upvotes: 2

Answers (1)

Justin Gray

Reputation: 5710

This was a bit of a head-scratcher for sure. Below is a self-contained version of sellar that works on OpenMDAO V2.5, despite using a NewtonSolver while NOT providing any derivatives. That seemingly should not work at all, but it does (though it does take more iterations than when you declared derivatives with FD)!!.

So what is going on here is a bit subtle, and is a function of how ExplicitComponent is actually implemented under the covers in OpenMDAO. I'll refer you to the OpenMDAO paper for more details, but OpenMDAO actually converts everything to an implicit form under the covers. So every explicit outputs actually gets a residual of the form R(output_var) = compute(inputs)[output_var] - outputs[output_var].

So what happens when you run newton is that the compute function gets called, and then a residual is formed drives the output variable vector to match the computed values. This is accomplished using the standard Newton-method: [dR/du] [delta-u] = -[R(u)].

So how does this work at all, if you don't provide any derivatives? Well, note that dR_i/du_i = -1 (this is the derivative of an residual for an explicit variable with respect to the associated value in the output vector). The OpenMDAO ExplicitComponent class defines this one partial derivative automatically. There are other derivatives with respect to the inputs that then get provided by the sub-class of ExplicitComponent. So when you did not define any derivatives, you still got that dR_i/du_i = -1.

Then the Newton-method degenerated to -[I] [delta-u] = -[R(u)]. Which meant that the computed update at each iteration was just equal to the negative of the residual. Mathematically, this is effectively the same thing as solving using the NonlinearBlockJacobi solver.

This somewhat unintuitive behavior happened because the ExplicitComponent internally handles the implicit transformation and the associated derivative itself. However, if you had defined the Sellar components as sub-classes of an ImplicitComponent instead, then not declaring derivatives would not have worked. Also, note that you would not have been able to do optimization with this model without the FD-d derivatives either. It was just a quirk of the ExplicitComponent implementation that made the MDA work in this case.

import numpy as np

from openmdao.api import ExplicitComponent, Group, Problem, NewtonSolver, \
                         DirectSolver, IndepVarComp, ExecComp

class SellarDis1(ExplicitComponent):
    """
    Component containing Discipline 1 -- no derivatives version.
    """

    def setup(self):

        # Global Design Variable
        self.add_input('z', val=np.zeros(2))

        # Local Design Variable
        self.add_input('x', val=0.)

        # Coupling parameter
        self.add_input('y2', val=1.0)

        # Coupling output
        self.add_output('y1', val=1.0)

        # Finite difference all partials.
        # self.declare_partials('*', '*', method='fd')

    def compute(self, inputs, outputs):
        """
        Evaluates the equation
        y1 = z1**2 + z2 + x1 - 0.2*y2
        """
        z1 = inputs['z'][0]
        z2 = inputs['z'][1]
        x1 = inputs['x']
        y2 = inputs['y2']

        outputs['y1'] = z1**2 + z2 + x1 - 0.2*y2
        print('compute y1', outputs['y1'])

class SellarDis2(ExplicitComponent):
    """
    Component containing Discipline 2 -- no derivatives version.
    """

    def setup(self):
        # Global Design Variable
        self.add_input('z', val=np.zeros(2))

        # Coupling parameter
        self.add_input('y1', val=1.0)

        # Coupling output
        self.add_output('y2', val=1.0)

        # Finite difference all partials.
        # self.declare_partials('*', '*', method='fd')

    def compute(self, inputs, outputs):
        """
        Evaluates the equation
        y2 = y1**(.5) + z1 + z2
        """

        z1 = inputs['z'][0]
        z2 = inputs['z'][1]
        y1 = inputs['y1']
        print('y1', y1)

        # Note: this may cause some issues. However, y1 is constrained to be
        # above 3.16, so lets just let it converge, and the optimizer will
        # throw it out
        if y1.real < 0.0:
            y1 *= -1

        outputs['y2'] = y1**.5 + z1 + z2

class SellarMDA(Group):
    """
    Group containing the Sellar MDA.
    """

    def setup(self):
        indeps = self.add_subsystem('indeps', IndepVarComp(), promotes=['*'])
        indeps.add_output('x', 1.0)
        indeps.add_output('z', np.array([5.0, 2.0]))

        cycle = self.add_subsystem('cycle', Group(), promotes=['*'])
        cycle.add_subsystem('d1', SellarDis1(), promotes_inputs=['x', 'z', 'y2'], promotes_outputs=['y1'])
        cycle.add_subsystem('d2', SellarDis2(), promotes_inputs=['z', 'y1'], promotes_outputs=['y2'])

        # Nonlinear Block Gauss Seidel is a gradient free solver
        newton = cycle.nonlinear_solver = NewtonSolver()
        newton.options['iprint'] = 2
        newton.options['maxiter'] = 20
        newton.options['solve_subsystems'] = False
        cycle.linear_solver = DirectSolver()

        self.add_subsystem('obj_cmp', ExecComp('obj = x**2 + z[1] + y1 + exp(-y2)',
                           z=np.array([0.0, 0.0]), x=0.0),
                           promotes=['x', 'z', 'y1', 'y2', 'obj'])

        self.add_subsystem('con_cmp1', ExecComp('con1 = 3.16 - y1'), promotes=['con1', 'y1'])
        self.add_subsystem('con_cmp2', ExecComp('con2 = y2 - 24.0'), promotes=['con2', 'y2'])


prob = Problem()

prob.model = SellarMDA()

prob.setup()

prob['x'] = 2.
prob['z'] = [-1., -1.]

prob.run_model()


print(prob['y1'])
print(prob['y2'])

Upvotes: 1

(Openmdao 2.4.0) difference between providing no derivatives / forcing FD on a disciplines with derivatives

Answers (1)

Related Questions