Kid Charlamagne

Reputation: 588

Default options in computing the Jacobian in scipy.optimize.root

In the documentation for scipy.optimize.root with method='lm', the following are the defaults for the options keyword.

options={'col_deriv': 0, 'diag': None, 'factor': 100, 'gtol': 0.0, 'eps': 0.0, 'func': None, 'maxiter': 0, 'xtol': 1.49012e-08, 'ftol': 1.49012e-08}
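For context, the call I am making looks roughly like this (a minimal sketch with a made-up two-variable system; fun and the starting point are just placeholders, so the defaults above apply):

from scipy.optimize import root

# made-up system: roots at x = (sqrt(2), -sqrt(2)) and (-sqrt(2), sqrt(2))
fun = lambda x: [x[0]**2 - 2.0, x[0] + x[1]]

# method='lm' with no options argument uses the defaults quoted above
sol = root(fun, [1.0, 1.0], method='lm')
print(sol.x)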

On the description for col_deriv they say that

col_deriv : bool, optional

non-zero to specify that the Jacobian function computes derivatives down the columns (faster, because there is no transpose operation)

If I understand the statement correctly, it says that if I write col_deriv=True, for example, the Jacobian will be computed column-wise and will therefore be faster.

Question: If it is faster, why isn't a non-zero value for col_deriv the default?

Am I missing something here?

Upvotes: 1

Views: 1409

Answers (2)

AGN Gazer

Reputation: 8378

Maybe the documentation for scipy.optimize.leastsq could help, since it documents both Dfun (the Jacobian) and col_deriv. From Dfun we get:

Dfun : callable, optional

A function or method to compute the Jacobian of func with derivatives across the rows. If this is None, the Jacobian will be estimated.

From col_deriv we get:

col_deriv : bool, optional

non-zero to specify that the Jacobian function computes derivatives down the columns (faster, because there is no transpose operation).

My reading of this is as follows:

  1. By default, scipy expects that a function that computes the Jacobian matrix returns a matrix that follows the "normal" definition (see, e.g., https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant).

  2. However, scipy itself calls other functions, possibly written in Fortran (see, e.g., minpack), which expect the derivatives (with respect to the coordinates) to be placed in columns.

It follows that if the function that computes the Jacobian matrix returns a matrix with the derivatives placed along columns instead of rows, then scipy does not need to transpose the Jacobian before passing it to the minpack routine, which saves computational time.
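To illustrate the two layouts, here is a small sketch (the function f and the names jac_rows and jac_cols are made up for this example, not anything scipy defines):

import numpy as np

# made-up function from R^2 to R^2
f = lambda x: np.array([x[0]**2 * x[1], 5.0*x[0] + np.sin(x[1])])

# "normal" (row-oriented) Jacobian: row i holds the derivatives of the i-th component of f
jac_rows = lambda x: np.array([[2*x[0]*x[1], x[0]**2], [5.0, np.cos(x[1])]])

# column-oriented layout expected by the Fortran code: simply the transpose
jac_cols = lambda x: jac_rows(x).T

# finite differences with respect to x_j naturally fill row j of the column-oriented layout
x, eps = np.array([1.0, 2.0]), 1e-7
num = np.array([(f(x + eps*np.eye(2)[j]) - f(x)) / eps for j in range(2)])
print(np.allclose(num, jac_cols(x), atol=1e-5))  # True

With col_deriv set to a non-zero value, scipy can hand the user-supplied matrix to minpack as is, instead of transposing it first.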

Upvotes: 2

user6655984

Reputation:

The default corresponds to the mathematical convention for writing the Jacobian matrix, as recorded on Wikipedia among other places: the first row of the matrix consists of the partial derivatives of the first component of the function, etc. This works better in multivariable calculus, because one can multiply that matrix on the right by a column vector of increments in the variables to obtain the linear approximation to the change in the function.

For example, let

fun = lambda x: [x[0]**3-x[1], 2*x[0]+x[1]-12]

The conventional way of writing the Jacobian would be

jac = lambda x: [[3*x[0]**2, -1], [2, 1]]

where [3*x[0]**2, -1] comes from differentiating the first component of fun, x[0]**3-x[1].

The call root(fun, [0, 0], jac=jac) converges in 12 iterations. Alternatively, we could write the Jacobian in transposed form,

jact = lambda x: [[3*x[0]**2, 2], [-1, 1]]

and use

root(fun, [0, 0], jac=jact, options={"col_deriv": 1})

to the same effect. It's not obvious to me that the gain is worth it, but perhaps it is for very large systems.
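Here is a self-contained version of the comparison (a sketch; the exact iteration counts reported in nfev may vary slightly between scipy versions):

from scipy.optimize import root

fun = lambda x: [x[0]**3 - x[1], 2*x[0] + x[1] - 12]

# conventional, row-oriented Jacobian
jac = lambda x: [[3*x[0]**2, -1], [2, 1]]

# transposed, column-oriented Jacobian
jact = lambda x: [[3*x[0]**2, 2], [-1, 1]]

sol_rows = root(fun, [0, 0], jac=jac)
sol_cols = root(fun, [0, 0], jac=jact, options={"col_deriv": 1})

# both converge to the root (2, 8)
print(sol_rows.x, sol_rows.nfev)
print(sol_cols.x, sol_cols.nfev)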

If "col_deriv": 1 was the default, there'd be unhappy users who implemented the Jacobian in the way they are used to and found the performance is poor (using an incorrect version of Jacobian in the above example more than doubles the number of steps of root search).

Upvotes: 2
