Reputation: 1553
I'm having trouble doing something as relatively simple as:
Here's a small working example of the output I'm getting:
from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
import math
%matplotlib inline

def lognormDrive(mu,variance):
    size = 1000
    sigma = math.sqrt(variance)
    np.random.seed(1)
    gaussianData = stats.norm.rvs(loc=mu, scale=sigma, size=size)
    logData = np.exp(gaussianData)
    shape, loc, scale = stats.lognorm.fit(logData, floc=mu)
    return stats.lognorm.pdf(logData, shape, loc, scale)

plt.plot(lognormDrive(37,0.8))
And as you might notice, the plot makes absolutely no sense.
Any ideas?
I've followed these posts: POST1 POST2
Thanks in advance!
Elaboration: I am building a small script that will
Upvotes: 0
Views: 7134
Reputation: 114976
In the call to lognorm.fit(), use floc=0, not floc=mu.
(The location parameter of the lognorm distribution simply translates the distribution. You almost never want to do that with the log-normal distribution.)
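To make the role of the location parameter concrete, here is a minimal sketch (not part of the original answer). It assumes illustrative values for mu, sigma, and loc, and checks two things: that lognorm with loc=0, s=sigma, and scale=exp(mu) matches the textbook log-normal density, and that a nonzero loc only slides that same curve along the x-axis.

```python
import numpy as np
from scipy import stats

# Illustrative values (assumptions), not the ones from the question.
mu, sigma = 1.5, 0.5

# If log(X) ~ Normal(mu, sigma), then X ~ lognorm(s=sigma, loc=0, scale=exp(mu)).
x = np.linspace(0.5, 20.0, 50)
pdf_scipy = stats.lognorm.pdf(x, sigma, loc=0, scale=np.exp(mu))

# Textbook log-normal density written out by hand for comparison.
pdf_manual = (np.exp(-(np.log(x) - mu) ** 2 / (2 * sigma ** 2))
              / (x * sigma * np.sqrt(2 * np.pi)))
same_density = np.allclose(pdf_scipy, pdf_manual)

# A nonzero loc only translates the curve: evaluating at x + loc
# with loc=loc gives the same values as evaluating at x with loc=0.
loc = 3.0
pdf_shifted = stats.lognorm.pdf(x + loc, sigma, loc=loc, scale=np.exp(mu))
is_pure_shift = np.allclose(pdf_shifted, pdf_scipy)

print(same_density, is_pure_shift)
```

This is why fixing floc=mu produces a poor fit: it pins the left edge of the support at mu instead of at 0, which is unrelated to the mean of the underlying normal data.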
See A lognormal distribution in python
By the way, you are plotting the PDF of the unsorted sample values, so the plot in the corrected script won't look much different. You might find it more useful to plot the PDF against the sorted values. Here's a modification of your script that creates a plot of the PDF using the sorted samples:
from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
import math

def lognormDrive(mu,variance):
    size = 1000
    sigma = math.sqrt(variance)
    np.random.seed(1)
    gaussianData = stats.norm.rvs(loc=mu, scale=sigma, size=size)
    logData = np.exp(gaussianData)
    # Fix the location at 0 so that shape ~ sigma and scale ~ exp(mu).
    shape, loc, scale = stats.lognorm.fit(logData, floc=0)
    print("Estimated mu: %s" % np.log(scale))
    print("Estimated var: %s" % shape**2)
    # Sort the samples so the PDF is drawn as a curve over increasing x.
    logData.sort()
    return logData, stats.lognorm.pdf(logData, shape, loc, scale)

x, y = lognormDrive(37, 0.8)
plt.plot(x, y)
plt.grid()
plt.show()
The script prints:
Estimated mu: 37.0347152587
Estimated var: 0.769897988163
and creates the following plot:
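As a cross-check on the printed estimates, here is a minimal sketch (not part of the original answer) that compares the result of fitting with floc=0 against the mean and variance of the log of the data. For a log-normal sample the two should agree closely. The values of mu and variance, the sample size, and the variable names are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Illustrative values (assumptions), smaller than those in the question.
mu, variance = 2.0, 0.8
rng = np.random.default_rng(1)
data = np.exp(rng.normal(loc=mu, scale=np.sqrt(variance), size=5000))

# Fit with the location fixed at 0, as recommended above.
shape, loc, scale = stats.lognorm.fit(data, floc=0)
mu_hat = np.log(scale)   # estimate of the underlying normal mean
var_hat = shape ** 2     # estimate of the underlying normal variance

# The same quantities computed directly from the log of the data.
log_mean = np.log(data).mean()
log_var = np.log(data).var()

print("fit:      mu =", mu_hat, " var =", var_hat)
print("log data: mu =", log_mean, " var =", log_var)
```

If the two lines disagree noticeably, that usually suggests the location was not fixed at 0 or that the data are not well described by a log-normal distribution.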
Upvotes: 2