Reputation: 121
Why does the following produce an incorrect output for muf and stdf?
import numpy as np
from scipy.stats import norm
x=np.linspace(-50,50,100)
sig=10
mu=0
y=1/np.sqrt(2*sig*sig*np.pi)*np.exp(-(x-mu)*(x-mu)/(2*sig*sig))
muf, stdf = norm.fit(y)
print muf,stdf
This prints
0.00989999568634 0.0134634293279
Thanks.
Upvotes: 2
Views: 1915
Reputation: 23647
You misunderstood the purpose of norm.fit
. It does not fit a Gaussian to a curve but fits a normal distribution to data:
np.random.seed(42)
y = np.random.randn(10000) * sig + mu
muf, stdf = norm.fit(y)
print(muf, stdf)
# -0.0213598336843 10.0341220613
You can use curve_fit
to match the Normal distribution's parameters to a given curve, as it has been attempted originally in the question:
import numpy as np
from scipy.stats import norm
from scipy.optimize import curve_fit
x=np.linspace(-50,50,100)
sig=10
mu=0
y=1/np.sqrt(2*sig*sig*np.pi)*np.exp(-(x-mu)*(x-mu)/(2*sig*sig))
(muf, stdf), covf = curve_fit(norm.pdf, x, y, p0=[0, 1])
print(muf, stdf)
# 2.4842347485e-08 10.0000000004
Upvotes: 3
Reputation: 339695
The documentation of scipy.stats.norm
says for its fit
function
fit(data, loc=0, scale=1)
Parameter estimates for generic data.
To me this is highly ununderstandable and I'm pretty sure that one cannot expect this function to return a fit in the usual sense.
However, to fit a gaussian is rather straight forward:
from __future__ import division
import numpy as np
x=np.linspace(-50,50,100)
sig=10
mu=0
y=1/np.sqrt(2*sig*sig*np.pi)*np.exp(-(x-mu)*(x-mu)/(2*sig*sig)) #
def gaussian_fit(xdata,ydata):
mu = np.sum(xdata*ydata)/np.sum(ydata)
sigma = np.sqrt(np.abs(np.sum((xdata-mu)**2*ydata)/np.sum(ydata)))
return mu, sigma
print gaussian_fit(x,y)
This prints
(-7.474196315587989e-16, 9.9999422983567516)
which is sufficiently close to the expected values of (0, 10)
.
Upvotes: 3