Reputation: 93

Why does scipy.optimize.curve_fit not produce a line of best fit for my points?

I have a set of data points (x and y in the code below), and I am trying to fit a straight line of best fit through them. I am using scipy.optimize.curve_fit. My code produces a line, but not a line of best fit. I have tried giving the model function parameters to use for the gradient and the intercept, but each time it produces exactly the same line, which does not fit my data points.

The blue dots are my data points the red line should be fitted to:

[Image: plot of the blue data points and the red line]

If anyone could point out where I am going wrong I would be extremely grateful:

import numpy as np
import matplotlib.pyplot as mpl
import scipy as sp
import scipy.optimize as opt

x=[1.0,2.5,3.5,4.0,1.1,1.8,2.2,3.7]
y=[6.008,15.722,27.130,33.772,5.257,9.549,11.098,28.828]
trialX = np.linspace(1.0,4.0,1000)                         #Trial values of x

def f(x,m,c):                                        #Defining the function y(x)=(m*x)+c
    return (x*m)+c

popt,pcov=opt.curve_fit(f,x,y)                       #Returning popt and pcov
ynew=f(trialX,*popt)                                                  

mpl.plot(x,y,'bo')
mpl.plot(trialX,ynew,'r-')
mpl.show()

Upvotes: 8

Answers (2)

Hooked

Reputation: 88148

EDIT: This behavior has now been patched in the current version of scipy to make .curve_fit a bit more foolproof:

https://github.com/scipy/scipy/issues/3037

For some reason, .curve_fit really wants the input to be a numpy array and will give you erroneous results if you pass it a regular list (IMHO this is unexpected behavior and may be a bug). Change the definition of x to:

x=np.array([1.0,2.5,3.5,4.0,1.1,1.8,2.2,3.7])

And you get:

[Image: plot of the data points with the fitted line]

I'm guessing that this happens because m*x, where m is an integer and x is a list, produces m copies of that list, which is clearly not the result you were looking for!
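A minimal sketch of the difference described above, assuming a small illustrative list and a hypothetical integer gradient `m` (neither is taken from the question's fit): multiplying a Python list by an integer repeats the list, while multiplying a numpy array by the same integer scales each element.

```python
import numpy as np

x_list = [1.0, 2.5, 3.5]   # illustrative values, a plain Python list
x_arr = np.array(x_list)   # the same values as a numpy array

m = 2  # hypothetical integer gradient, chosen only for this demonstration

repeated = x_list * m  # sequence repetition: the list is concatenated with itself
scaled = x_arr * m     # elementwise multiplication: every value is doubled

print(repeated)  # six elements, the original three repeated twice
print(scaled)    # three elements, each multiplied by 2
```

This is why a model function written as `(x*m)+c` only behaves like a straight line when `x` is an array.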

Upvotes: 3

dnf0

Reputation: 1649

You could alternatively use numpy.polyfit to get the line of best fit:

import numpy as np
import matplotlib.pyplot as mpl

x=[1.0,2.5,3.5,4.0,1.1,1.8,2.2,3.7]
y=[6.008,15.722,27.130,33.772,5.257,9.549,11.098,28.828]
trialX = np.linspace(1.0,4.0,1000)                         #Trial values of x

#get the first-order (linear) coefficients: fit[0] is the gradient, fit[1] the intercept
fit = np.polyfit(x, y, 1)

#evaluate the fitted line at the trial x values
ynew = trialX * fit[0] + fit[1]

mpl.plot(x,y,'bo')
mpl.plot(trialX,ynew,'r-')
mpl.show()

Here is the output: [Image: plot of the data points with the line of best fit]
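As a rough cross-check of the two answers, the sketch below fits the question's data with both scipy.optimize.curve_fit (after converting the inputs to numpy arrays) and numpy.polyfit, then compares the coefficients. The function name `line` and the comparison tolerance are assumptions made for this illustration, and plotting is left out to keep it short.

```python
import numpy as np
from scipy.optimize import curve_fit

# Data from the question, converted to numpy arrays up front
x = np.asarray([1.0, 2.5, 3.5, 4.0, 1.1, 1.8, 2.2, 3.7], dtype=float)
y = np.asarray([6.008, 15.722, 27.130, 33.772, 5.257, 9.549, 11.098, 28.828], dtype=float)

def line(x, m, c):
    # Straight-line model y = m*x + c (name chosen for this sketch)
    return m * x + c

# Least-squares fit with curve_fit; popt holds [m, c]
popt, pcov = curve_fit(line, x, y)

# Degree-1 polynomial fit; coefficients are ordered [gradient, intercept]
coeffs = np.polyfit(x, y, 1)

print("curve_fit:", popt)
print("polyfit:  ", coeffs)
print("agree:", np.allclose(popt, coeffs, rtol=1e-4))
```

Both routines minimise the same sum of squared residuals for a straight line, so with array inputs they should give essentially the same gradient and intercept.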

Upvotes: 6
