Scipy minimize, fmin, leastsq type problems (setting array element with sequence), bad fit

I'm having trouble with the scipy.optimize.fmin and scipy.optimize.minimize functions. I've checked and confirmed that all the arguments passed to the function are of type numpy.array, as well as the return value of the error function. Also, the carreau function returns a scalar value.

The reason for some of the extra arguments, such as size, is this: I need to fit data with a given model (Carreau). The data is taken at different temperatures, which are corrected with a shift factor (which is also fitted by the model), so I end up with several sets of data that should all be used to calculate the same 4 constants (parameters p).

I read that I can't pass the fmin function a list of arrays, so I had to concatenate all data into x_data_lin, keeping track of the different sets with the size parameter. t holds different test temperatures, while t_0 is a one-element array which holds the reference temperature.
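To illustrate the bookkeeping (a hypothetical sketch; the real values and set sizes in my data differ):

import numpy as np

# three made-up datasets taken at three temperatures
x_190 = np.array([0.02, 0.2, 2.0])
x_210 = np.array([0.04, 0.4, 4.0, 40.0])
x_230 = np.array([0.09, 0.9, 9.0])

x_data_lin = np.concatenate([x_190, x_210, x_230])     # one flat 1D array
size = np.array([len(x_190), len(x_210), len(x_230)])  # [3, 4, 3]
t_list = np.array([190.0, 210.0, 230.0])               # one temperature per set
t_0 = np.array([80.0])                                 # reference temperature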

I am positive (triple-checked) that all the arguments passed to the function, as well as the result, are one-dimensional arrays. Here's the relevant code:

import numpy as np
import scipy.optimize
from scipy.optimize import fmin as simplex

def err_func2(p, x, y, t, t_0, size):
    # Residuals across all concatenated datasets; size[i] is the length of set i
    result = np.array([])
    temp = 0
    for i in range(len(size)):  # one pass per dataset / temperature
        for j in range(int(temp), int(temp + size[i])):
            result = np.append(result, carreau(p, x[j], t[i], t_0[0]) - y[j])
        temp += size[i]
    return result

p1 = simplex(err_func2, initial_guess,
            args=(x_data_lin, y_data_lin, t_list, t_0, size), full_output=0)

Here's the error:

Traceback (most recent call last):
  File "C:\Python27\Scripts\projects\Carreau - WLF\carreau_model_fit.py", line 146, in <module>
    main()
  File "C:\Python27\Scripts\projects\Carreau - WLF\carreau_model_fit.py", line 105, in main
    args=(x_data_lin, y_data_lin, t_list, t_0, size), full_output=0)
  File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line 351, in fmin
    res = _minimize_neldermead(func, x0, args, callback=callback, **opts)
  File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line 415, in _minimize_neldermead
    fsim[0] = func(x0)
ValueError: setting an array element with a sequence.
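For reference, the same ValueError can be reproduced with a toy objective that has nothing to do with my model (hypothetical code): fmin raises it whenever the objective returns an array rather than a scalar.

import numpy as np
from scipy.optimize import fmin

def vec_obj(p):
    # returns an array of residuals -- fine for leastsq, fatal for fmin
    return np.array([p[0] - 1.0, p[1] + 2.0])

# fmin(vec_obj, np.zeros(2))  # ValueError: setting an array element with a sequence.

def scalar_obj(p):
    # squaring and summing the residuals yields the scalar fmin expects
    return (vec_obj(p)**2).sum()

print(fmin(scalar_obj, np.zeros(2)))  # converges near [1., -2.]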

It's worth noting that I got the leastsq function working while passing it lists of arrays. Unfortunately, it did a poor job of fitting the data. But, as it took me a lot of time and research to get to that point, I'll post the code as follows. If somebody is interested in seeing all of the code, I would gladly post it if you can recommend somewhere to upload a few files (it includes another imported script and, of course, sample data):

def error_function(p, x, y, t, t_0):
    result = np.array([])
    for index in range(len(x)):
        result = np.append(result, carreau(p, x[index],
                                           t[index], t_0) - y[index])
    return result

p1, success = scipy.optimize.leastsq(error_function, initial_guess,
                                     args=(x_list, y_list, t_list, t_0),
                                     maxfev=10000)

I was going to post a picture of the graphed data with the leastsq fit, but I don't yet have the requisite 10 reputation points. :(

Late Edit: I have now gotten optimize.curve_fit and optimize.leastsq to work (which, probably not so coincidentally, give the same answer), but the curve is bad. I've been trying to figure out optimize.minimize, but it's been a bit of a headache. The simplex method (fmin, Nelder-Mead, whatever you want to call it) will run, but produces a crazy answer nowhere close to the data. I've never worked with nonlinear optimization problems before, and I don't really know what direction to head in.

Here's the working curve_fit code:

def temp_shift(t_s, t, t_0):
    """ Calculate the a_T temperature shift factor for polymer viscosity
    curves. The fitted variable is the standard temperature, t_s.
    """
    C_1 = 8.86
    C_2 = 101.6
    return(np.exp(
        (C_1*(t_0-t_s) / (C_2+(t_0-t_s))) - (C_1*(t-t_s) / (C_2 + (t-t_s)))
        ))
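As a quick sanity check (the values here are just illustrative), the shift factor should be exactly 1 at the reference temperature and below 1 at higher temperatures:

print(temp_shift(-20.0, 80.0, 80.0))   # 1.0 when t == t_0
print(temp_shift(-20.0, 190.0, 80.0))  # < 1: viscosity drops at higher temp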

def pass_data(t, t_0):
    def carreau_2(x, p0, p1, p2, p3):
        visc_0 = p0  # zero-shear viscosity
        m = p1
        n = p2
        t_s = p3     # standard temperature (the fitted shift parameter)
        a_T = temp_shift(t_s, t, t_0)
        return visc_0 * a_T / (1 + m * x * a_T)**n
    return carreau_2

initial_guess = np.array([20000, 3, 0.94, -20])

p1, conv = scipy.optimize.curve_fit(pass_data(t_all, t_0), x_data_lin,
                                   y_data_lin, initial_guess)
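With the fitted constants in hand, the same closure can be reused to evaluate the model (a sketch; p1 holds the four fitted parameters returned by curve_fit above):

fitted_model = pass_data(t_all, t_0)
y_fit = fitted_model(x_data_lin, *p1)  # unpack visc_0, m, n, t_s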

Here's some sample data:

x_data_lin = np.array([0.01998, 0.04304, 0.2004, 0.43160, 0.92870, 2.0000, 4.30900,
                       9.28500, 15.51954, 21.94936, 37.52960, 90.41786, 204.35230,
                       331.58495, 811.92250, 1694.55309, 3464.27648, 8826.65738,
                       14008.00242])

y_data_lin = np.array([13520.00000, 13740.00000, 12540.00000, 9384.00000, 5201,
                       3232.00000, 2094.00000, 1484.00000, 999.00000, 1162.05088,
                       942.56946, 705.62489, 429.47341, 254.15136, 185.22916,
                       122.07113, 76.46324, 47.85064, 25.74315, 18.84875])

t_all = np.array([190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190,
                  190, 190, 190, 190, 190, 190, 190])

t_0 = 80

Here's a picture of the result of curve_fit (now that I have 10 points and can post!). Note there are 3 curves drawn, because I used 3 sets of data to optimize the curve, at 3 different temperatures. Polymers have the property that the shear rate - viscosity relationship stays the same, just shifted by a temperature factor a_T:

[Figure: scipy.optimize.curve_fit result]

I'd really appreciate any suggestions about how to improve the fit, or how to define the function so that optimize.minimize works, and which method (Nelder-Mead, Powell, BFGS) might work best.

Another edit to add: I got the Nelder-Mead (optimize.fmin, and the default of optimize.minimize) function to work - I'll include the revised error function below. Before, I simply summed the result array and returned it. This led to extremely negative values (obviously, since the function's goal is to minimize). Squaring the result before summing it solved that problem. Note that I also changed the function completely to take advantage of numpy's array broadcasting, as suggested by JaminSore (Thanks Jamin!)

def err_func2(p, x, y, t, t_0):
    # sum of squared residuals -- a single scalar, as fmin/minimize require
    return ((carreau(p, x, t, t_0) - y)**2).sum()
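For reference, this is how the scalar objective plugs into optimize.minimize (a sketch; carreau and the data arrays are as defined elsewhere in this post):

res = scipy.optimize.minimize(err_func2, initial_guess,
                              args=(x_data_lin, y_data_lin, t_all, t_0),
                              method='Nelder-Mead')
p1 = res.x  # the four fitted constants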

Unfortunately, the Nelder-Mead function gives me the same result as leastsq and curve_fit. You can see in the graph above that it's not the optimal fit; in fact, at this point, Microsoft Excel's solver function is doing a better job on the data.

At least, I hope this thread can be useful for beginners to scipy.optimize in the future, since it's taken me quite a while to discover all of this.

Upvotes: 3

Views: 6339

Answers (2)

JaminSore

Reputation: 3936

Unlike leastsq, fmin can only deal with error functions that return a scalar, so if possible you have to rewrite your error function so that it returns one. Here is a simple working example.

Import the necessary libraries

import numpy as np
from scipy.optimize import fmin

Define a helper function (you'll see later)

def prob(a, b):
    # logistic (sigmoid) function of a - b
    return (1 + np.exp(b - a))**-1

Simulate some data

true_ = np.random.normal(size = 100) #parameters we're trying to recover
b = np.random.normal(size = 20)

exp_ = prob(true_[:, None], b) #expected
a_s, b_s = true_.shape[0], b.shape[0]
noise = np.random.uniform(size = (a_s, b_s))
response = (noise > (1 - exp_)).astype(int)

Define our error function (I'm using lambdas but this is not recommended in practice)

# sum of the squared residuals
err_func = lambda a : ((prob(a[:, None], b) - response) ** 2).sum()
result = fmin(err_func, np.zeros_like(true_)) #solve

If I remove the .sum() at the end of my error function definition, I get the same error.

Upvotes: 3

OK, now I finally know the answer! First, the final piece, then a recap. The problem with the fit wasn't the fault of curve_fit, leastsq, Nelder-Mead, or Powell (the methods I've tried); it has to do with the relative weights of the errors. Since this data is on a log scale, errors in the fit near the high y values are very costly, while errors near the low y values are insignificant. To correct this, I made the errors relative by dividing by the y value of the data, as follows:

def err_func2(p, x, y, t, t_0):
    # relative (fractional) errors, squared and summed into a scalar
    return (((carreau(p, x, t, t_0) - y) / y)**2).sum()
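A toy calculation (made-up 10% misses at the largest and smallest y values in my data) shows why this matters:

import numpy as np

y = np.array([13520.0, 18.84875])  # largest and smallest y values
miss = 0.10 * y                    # a 10% miss at each point
print(miss**2)                     # absolute cost: ~1.8e6 vs. ~3.6
print((miss / y)**2)               # relative cost: 0.01 at both points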

Now each relative error is squared, summed, then minimized, giving the following fit (using optimize.minimize with the Powell method, although it should be the same for the other methods as well).

[Figure: finally, a good fit]

So now a recap of the answers found in this thread:

  • The easiest way (or at least, for me, the most fool-proof way) to deal with curve fitting is to collect all the data into 1D numpy.arrays. Then you can rely on numpy's array broadcasting to perform all operations element-wise. For example, if array_1 = [a, b] and array_2 = [c, d], then array_1 + array_2 = [a+c, b+d]. This works for addition, subtraction, multiplication, division, and powers: array_1**array_2 = [a**c, b**d]. (See the short sketch after this list.)

  • For the optimize.leastsq function, you need to let the objective function return an array; i.e. return result where result is an array. For optimize.curve_fit, you also return an array. In this case, it's a bit more complicated to pass extra arguments (think other constants), but you can do it with a nested function, as I demonstrated above in the pass_data function.

  • For optimize.minimize, you need to return a scalar - that is, a single number. (Returning an array of residuals, as for leastsq, is exactly what triggers the "setting an array element with a sequence" error.) To get this scalar, you can simply square and sum the result, as in err_func2 above. Squaring is very important: without it, negative errors take over and drive the resulting scalar extremely negative.

  • Finally, as mentioned, when your data crosses several scales (10**5, 10**4, 10**3, etc.), it may be necessary to normalize the errors. I did this by dividing each error by the y value.
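Here's the short broadcasting sketch promised above (toy values):

import numpy as np

array_1 = np.array([2.0, 3.0])
array_2 = np.array([4.0, 5.0])
print(array_1 + array_2)   # [ 6.  8.]
print(array_1 * array_2)   # [ 8. 15.]
print(array_1 ** array_2)  # [ 16. 243.]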

So... I guess that's it? Finally?

Upvotes: 1
