Reputation: 165
I have some data in two arrays, which appears to have a break in it. I want my code to figure out where the break is with using piecewise in scipy. Here is what I have:
from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
x = np.array([7228,7620,7730,7901,8139,8370,8448,8737,8824,9089,9233,9321,9509,9568,9642,9756,9915,10601,10942], dtype=np.float)
y= np.array([.874,.893,.8905,.8916,.9095,.9142,.9109,.9185,.9169,.9251,.9290,.9304,.9467,.9378,0.9464,0.9508,0.9583,0.9857,0.9975],dtype=np.float)
def piecewise_linear(x, x0, y0, k1, k2):
return np.piecewise(x, [x < x0], [lambda x:k1*x + y0-k1*x0, lambda x:k2*x + y0-k2*x0])
p , e = optimize.curve_fit(piecewise_linear, x, y)
perr = np.sqrt(np.diag(e))
xd = np.linspace(7228, 11000, 3000)
plt.plot(x, y, "o")
plt.plot(xd, piecewise_linear(xd, *p))
My issue is if I run this, I get an error, "OptimizeWarning: Covariance of the parameters could not be estimated category=OptimizeWarning)". Not sure how to get around this? Is there maybe a way to feed initial parameters into this function to help it converge or similar?
Note, I do realize that the other way I could be getting this to work is interpolating and finding the second derivative of my data. I've already done this, but because my data is not evenly spaced/ the y axis data has some error in it I am interested in getting it to work this way as well for statistical purposes. So, to be clear, what I want here are the parameters for the two lines (slope/intercept), and the inflection point. (Ideally I would love to get an error too on these too, but not sure if that's possible with this method.) Thanks in advance!
Upvotes: 2
Views: 397
Reputation: 1705
Probably your software uses an iterative method starting from an initial guess. Generally the initial guess is the weakness of those methods.
If you want to overcome this kind of difficulty, use a non iterative method which don't require an initial guess. If the criteria of fitting of the non iterative method is not convenient for you, nevertheless first use the non iterative method to obtain a first solution. Then use a classical iterative method, starting from the solution found first.
For example, the next result is obtained thanks to the very simple algorithm (not iterative, no initial guess) which is given pp. 12-13in the paper : https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf
Upvotes: 0
Reputation: 23637
The code works perfectly fine, only the initial values are causing problems.
By default curve_fit
starts with all parameters set to 1. Thus, x0 starts way out of range of the x
in your data and the optimizer cannot compute a sensible gradient.
This small modification will fix the issue:
# make sure initial x0 and y0 are in range of the data
p0 = [np.mean(x), np.mean(y), 1, 1]
p , e = optimize.curve_fit(piecewise_linear, x, y, p0) # set initial parameter estimates
perr = np.sqrt(np.diag(e))
xd = np.linspace(7228, 11000, 3000)
plt.plot(x, y, "o")
plt.plot(xd, piecewise_linear(xd, *p))
print(p) # [ 9.32099947e+03 9.32965835e-01 2.58225121e-05 4.05400820e-05]
print(np.diag(e)) # [ 4.56978067e+04 5.52060368e-05 3.88418404e-12 7.05010755e-12]
Upvotes: 2