kb3hts
kb3hts

Reputation: 165

Curvefitting optimization error when fitting piecewise linear function

I have some data in two arrays, which appears to have a break in it. I want my code to figure out where the break is with using piecewise in scipy. Here is what I have:

from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

x = np.array([7228,7620,7730,7901,8139,8370,8448,8737,8824,9089,9233,9321,9509,9568,9642,9756,9915,10601,10942], dtype=np.float)
y= np.array([.874,.893,.8905,.8916,.9095,.9142,.9109,.9185,.9169,.9251,.9290,.9304,.9467,.9378,0.9464,0.9508,0.9583,0.9857,0.9975],dtype=np.float)

def piecewise_linear(x, x0, y0, k1, k2):
    return np.piecewise(x, [x < x0], [lambda x:k1*x + y0-k1*x0, lambda x:k2*x + y0-k2*x0])

p , e = optimize.curve_fit(piecewise_linear, x, y)
perr = np.sqrt(np.diag(e))
xd = np.linspace(7228, 11000, 3000)
plt.plot(x, y, "o")
plt.plot(xd, piecewise_linear(xd, *p))

My issue is if I run this, I get an error, "OptimizeWarning: Covariance of the parameters could not be estimated category=OptimizeWarning)". Not sure how to get around this? Is there maybe a way to feed initial parameters into this function to help it converge or similar?

Note, I do realize that the other way I could be getting this to work is interpolating and finding the second derivative of my data. I've already done this, but because my data is not evenly spaced/ the y axis data has some error in it I am interested in getting it to work this way as well for statistical purposes. So, to be clear, what I want here are the parameters for the two lines (slope/intercept), and the inflection point. (Ideally I would love to get an error too on these too, but not sure if that's possible with this method.) Thanks in advance!

Upvotes: 2

Views: 397

Answers (2)

JJacquelin
JJacquelin

Reputation: 1705

Probably your software uses an iterative method starting from an initial guess. Generally the initial guess is the weakness of those methods.

If you want to overcome this kind of difficulty, use a non iterative method which don't require an initial guess. If the criteria of fitting of the non iterative method is not convenient for you, nevertheless first use the non iterative method to obtain a first solution. Then use a classical iterative method, starting from the solution found first.

For example, the next result is obtained thanks to the very simple algorithm (not iterative, no initial guess) which is given pp. 12-13in the paper : https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf

enter image description here

Upvotes: 0

MB-F
MB-F

Reputation: 23637

The code works perfectly fine, only the initial values are causing problems.

By default curve_fit starts with all parameters set to 1. Thus, x0 starts way out of range of the x in your data and the optimizer cannot compute a sensible gradient. This small modification will fix the issue:

# make sure initial x0 and y0 are in range of the data
p0 = [np.mean(x), np.mean(y), 1, 1]

p , e = optimize.curve_fit(piecewise_linear, x, y, p0)  # set initial parameter estimates
perr = np.sqrt(np.diag(e))
xd = np.linspace(7228, 11000, 3000)
plt.plot(x, y, "o")
plt.plot(xd, piecewise_linear(xd, *p))

print(p)  # [  9.32099947e+03   9.32965835e-01   2.58225121e-05   4.05400820e-05]
print(np.diag(e))  # [  4.56978067e+04   5.52060368e-05   3.88418404e-12   7.05010755e-12]

enter image description here

Upvotes: 2

Related Questions