What is the point of norm.fit in scipy?

Im generating a random sample of data and plotting its pdf using scipy.stats.norm.fit to generate my loc and scale parameters.

I wanted to see how different my pdf would look like if I just calculated the mean and std using numpy without any actual fitting. To my surprise when I plot both pdfs and print both sets of mu and std the results I get are exactly the same. So my question is, what is the point of norm.fit if I can just calculate the mean and std of my sample and still get the same results?

This is my code:

import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

data = norm.rvs(loc=0,scale=1,size=200)

mu1 = np.mean(data)

std1 = np.std(data)

print(mu1)
print(std1)

mu, std = norm.fit(data)

plt.hist(data, bins=25, density=True, alpha=0.6, color='g')

xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 100)
p = norm.pdf(x, mu, std)
q = norm.pdf(x, mu1, std1)
plt.plot(x, p, 'k', linewidth=2)
plt.plot(x, q, 'r', linewidth=1)
title = "Fit results: mu = %.5f,  std = %.5f" % (mu, std)
plt.title(title)

plt.show()

And this is the results I got:

Pdf of a random set of values

mu1 = 0.034824979915482716

std1 = 0.9945453455908072

Upvotes: 8

Views: 21460

Answers (1)

Arya McCarthy
Arya McCarthy

Reputation: 8814

The point is that there are several other distributions out there besides the normal distribution. Scipy provides a consistent API for learning the parameters of these distributions from data. (Want an exponential distribution instead of a normal distribution? It’s scipy.stats.expon.fit.)

So sure, your way also works because the parameters of the normal distribution happen to be the mean and standard deviation. But this is about providing a consistent interface across distributions, including ones where that’s not true.

Upvotes: 8

Related Questions