redfol
redfol

Reputation: 13

equation system with fsolve

I try to find a solution for a system of equations by using scipy.optimize.fsolve in python 2.7. The goal is to calculate equilibrium concentrations for a chemical system. Due to the nature of the problem, some of the constants are very small. Now for some combinations i do get a proper solution. For some parameters i don't find a solution. Either the solutions are negative, which is not reasonable from a physical point of view or fsolve produces:
ier = 3, 'xtol=0.000000 is too small, no further improvement in the approximate\n solution is possible.')
ier = 4, 'The iteration is not making good progress, as measured by the \n improvement from the last five Jacobian evaluations.')
ier = 5, 'The iteration is not making good progress, as measured by the \n improvement from the last ten iterations.')

It seems to me, based on my research, that the failure to find proper solutions of the equation system is connected to the datatype float.64 not being precise enough. As a friend pointed out, the system is not well conditioned with parameters differing in several magnitudes. So i tried to use fsolve with the mpfr type provided by the gmpy2 module but that resulted in the following error:
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

Now here is a small example with parameter which lead to a solution if the randomized starting parameters fit happen to be good. However if the constant C_HCL is chosen to be something like 1e-4 or bigger then i never find a proper solution.

from numpy import *
from scipy.optimize import *

K_1    = 1e-8
K_2    = 1e-8 
K_W    = 1e-30
C_HCL  = 1e-11
C_NAOH = K_W/C_HCL
C_HL   = 1e-6

if C_HCL-C_NAOH > 0:
    Saeure_Base = C_HCL-C_NAOH+sqrt(K_W)
    OH_init = K_W/(Saeure_Base)
elif C_HCL-C_NAOH < 0:
    OH_init = C_NAOH-C_HCL+sqrt(K_W)
    Saeure_Base = K_W/OH_init

# some randomized start parameters    
G1 = random.uniform(0, 2)*Saeure_Base
G2 = random.uniform(0, 2)*OH_init
G3 = random.uniform(1, 2)*C_HL*(sqrt(K_W))/(Saeure_Base+OH_init)
G4 = random.uniform(0.1, 1)*(C_HL - G3)/2
G5 = C_HL - G3 - G4
zGuess = array([G1,G2,G3,G4,G5])

#equation system / 5 variables --> H3O, OH, HL, H2L, L
def myFunction(z):

    H3O   = z[0]
    OH    = z[1]
    HL    = z[2]
    H2L   = z[3]
    L     = z[4]

    F = empty((5))

    F[0] = H3O*L/HL  - K_1
    F[1] = OH*H2L/HL - K_2
    F[2] = K_W - OH*H3O
    F[3] = C_HL - HL - H2L - L
    F[4] = OH+L+C_HCL-H2L-H3O-C_NAOH
    return F

z = fsolve(myFunction,zGuess, maxfev=10000, xtol=1e-15, full_output=1,factor=0.1)

print z

So the questions are. Is this problem based on the precision of float.64 and if yes , (how) can it be solved with python? Is fsolve the way to go? Would i need to change the fsolve function so it accepts a different data type?

Upvotes: 1

Views: 4024

Answers (1)

Coding thermodynamist
Coding thermodynamist

Reputation: 1403

The root of your problem is either theoretical or numerical.

The scipy.optimize.fsolvefunction is based on the MINPACK Fortran solver (http://www.netlib.org/minpack/). This solver use a Newton-Raphson optimisation algorithm to provide the solution.

There are underlying assumptions about the smoothness of the function when you use this algorithm. For example, the jacobian matrix at the solution point x is supposed to be invertible. The one you are more concerned about is the basins of attraction. In order to converge, the starting point of the algorithm needs to be near the actual solution, i.e. in the basins of attraction. This condition is always met for convex functions, however it is easy to find some functions for which this algorithm behaves badly. Your function is one of this as you have a fraction of your inputs parameters.

To address this issue you should just change the starting point. This starting point becomes also very important for functions with multiple solutions: this picture from the wikipedia article shows you the solution found depending of the starting point (five colours for five solutions); so you should be careful with your solution and actually check the "physical" aspects of your solution.

For the numerical aspects, the Newton-Raphson algorithm needs to have the value of the jacobian matrix (the derivatives matrix). If it is not provided to the MINPACK solver, the jacobian is estimated with a finite-difference formula. The perturbation step for the finite difference formula need to be provided epsfcn=None, the None being here as default value only in the case where fprimeis provided (there is no need for the jacobian estimation in this case). So first you should incorporate that. You could also specify directly the jacobian by derivating your function by hand.

However, the minimum value for the step size will be the machine precision, also called machine epsilon. For your problem, you have very small inputs values which can be a problem. I would suggest multiply everyone of them by the same value (like 10^6), it is equivalent to a change of the units but will avoid rounding up errors and problems with machine precision.

This problem is also important when you look at the parameter xtol=1e-15 you provided. In your error message, it gives xtol=0.000000, as it is below machine precision and cannot be taken into account. Also, if you look at your line F[2] = K_W - OH*H3O, given the machine precision, it does not matter if K_W is 1e-15or 1e-30. 0 is a solution for both of this case compare to the machine precision. To avoid this problem, just multiply everything by a bigger value.

So to sum up:

  1. For the Newton-Raphson algorithm, the initialisation point matters !
  2. For this algorithm, you should specify how you compute the jacobian !
  3. In numerical computation, never work with small values. You can easily change the dimension to something different: it is basic units conversion, like working in gram instead of kilogram.

Upvotes: 5

Related Questions