Reputation: 768
When searching for the best-fit distribution for my dataset, the result was the Exponentially Modified Normal distribution with the following parameters:
K=10.84, loc=154.35, scale=73.82
SciPy provides a way to compute the mean of the fitted distribution:
fitted_mean = scipy.stats.exponnorm.stats(K=10.84, loc=154.35, scale=73.82, moments='mean')
The result is fitted_mean=984, which is the same as the mean of my dataset. However, I'm not sure what this is telling me. I thought loc=154.35 was the mean of the distribution.
What are these two means? If I fitted the data with the best distribution, isn't loc (154.35) the new and only mean?
Upvotes: 0
Views: 696
Reputation: 114911
For the exponentially modified normal distribution, the location parameter is not the same as the mean. This is true for many distributions.
Take a look at the Wikipedia page for the exponentially modified Gaussian distribution. This is the same distribution as scipy.stats.exponnorm, but with a different parameterization. The mapping of the parameters between the Wikipedia version and SciPy is:
μ = loc
σ = scale
λ = 1/(K*scale)
The Wikipedia page says the mean of the distribution is μ + 1/λ, which, in terms of the SciPy parameters, is loc + K*scale.
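As a minimal sketch of that mapping, the conversion and the mean formula can be written in plain Python. The helper names `scipy_to_wikipedia` and `emg_mean` are made up for illustration and are not part of SciPy; the parameter values are the ones from the question.

```python
def scipy_to_wikipedia(K, loc, scale):
    """Map scipy.stats.exponnorm parameters (K, loc, scale) to the
    Wikipedia parameters (mu, sigma, lambda).

    Hypothetical helper for illustration; not a SciPy function.
    """
    mu = loc
    sigma = scale
    lam = 1.0 / (K * scale)
    return mu, sigma, lam


def emg_mean(K, loc, scale):
    """Mean of the distribution, mu + 1/lambda, i.e. loc + K*scale."""
    mu, _sigma, lam = scipy_to_wikipedia(K, loc, scale)
    return mu + 1.0 / lam


# Fitted parameters reported in the question.
mean = emg_mean(K=10.84, loc=154.35, scale=73.82)
print(mean)  # close to 954.56, far from loc = 154.35
```

The gap between `loc` and the mean is exactly `K*scale`, so a large fitted `K` pushes the mean well away from the location parameter.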
When you fit the distribution to your data, you found
loc = 154.35
scale = 73.82
K = 10.84
The formula for the mean from the wikipedia page gives
loc + K*scale = 954.5587999999999
Here is the calculation using exponnorm:
In [16]: fitted_mean = scipy.stats.exponnorm.stats(K=10.84, loc=154.35, scale=73.82, moments='mean')
In [17]: fitted_mean
Out[17]: array(954.5587999999999)
which matches the result from the wikipedia formula.
(You reported fitted_mean = 984, but I assume that was a typographical error.)
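One way to see why the mean is not loc: an exponentially modified normal variable is the sum of an independent normal variable (mean μ, standard deviation σ) and an exponential variable with rate λ, so the exponential part adds 1/λ to the mean. The following is a rough simulation sketch using only the standard library; the sample size and seed are arbitrary choices of mine, and the sample mean will only approximate the exact value.

```python
import random

# Fitted parameters reported in the question.
K, loc, scale = 10.84, 154.35, 73.82

# Rate of the exponential component, from lambda = 1/(K*scale).
lam = 1.0 / (K * scale)

rng = random.Random(0)  # arbitrary seed, only for repeatability
n = 200_000             # arbitrary sample size

# Each draw is a normal(loc, scale) value plus an exponential(rate=lam) value.
samples = [rng.gauss(loc, scale) + rng.expovariate(lam) for _ in range(n)]

sample_mean = sum(samples) / n
exact_mean = loc + K * scale

print(f"sample mean: {sample_mean:.1f}")
print(f"exact mean:  {exact_mean:.1f}")
```

The sample mean should land near loc + K*scale rather than near loc, because the exponential component contributes an average of K*scale on top of the normal part.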
Upvotes: 3