Fitting and Plotting Lognormal

Question

I'm having trouble doing something as relatively simple as:

Draw N samples from a gaussian with some mean and variance
Exponentiate those N samples (so that their logs are Gaussian)
Fit a lognormal (using stats.lognorm.fit)
Spit out a nice and smooth lognormal pdf without inf values (using stats.lognorm.pdf)

Here's a small working example of the output I'm getting:

from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
import math

%matplotlib inline


def lognormDrive(mu,variance):
    size = 1000
    sigma = math.sqrt(variance)
    np.random.seed(1)
    gaussianData = stats.norm.rvs(loc=mu, scale=sigma, size=size)
    logData = np.exp(gaussianData)
    shape, loc, scale = stats.lognorm.fit(logData, floc=mu)
    return stats.lognorm.pdf(logData, shape, loc, scale)

plt.plot(lognormDrive(37,0.8))

And as you might notice, the plot makes absolutely no sense.

Any ideas?

I've followed these posts: POST1 POST2

Thanks in advance!

Elaboration: I am building a small script that will

Take raw data and fit a kernel distribution (empirical dist.)
Assume different distributions given the mean and variance of the data. This would be a gaussian and a lognormal
Plot those distributions together with the empirical dist. using interact
Calculate the Kullback-Leibler divergence between the different distributions when one turns the knob for the mean and variance (and skew eventually)
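
A minimal sketch of the non-interactive part of that workflow, for reference. The synthetic data, the evenly spaced grid, and the helper name kl_divergence are assumptions made for illustration and are not part of the original script; the divergence is approximated with a simple Riemann sum rather than an exact integral.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the raw data (an assumption; any positive sample works).
rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=0.5, size=2000)

# Empirical density via a Gaussian kernel density estimate.
kde = stats.gaussian_kde(data)

# Candidate parametric models fitted to the same data.
shape, loc, scale = stats.lognorm.fit(data, floc=0)
norm_mu, norm_sigma = stats.norm.fit(data)

# Evaluate all densities on a common, evenly spaced grid.
x = np.linspace(data.min(), data.max(), 2000)
dx = x[1] - x[0]
p = kde(x)
q_lognorm = stats.lognorm.pdf(x, shape, loc, scale)
q_norm = stats.norm.pdf(x, norm_mu, norm_sigma)


def kl_divergence(p, q, dx):
    """Riemann-sum approximation of KL(p || q) on an evenly spaced grid."""
    mask = (p > 0) & (q > 0)  # skip points where the log ratio is undefined
    return np.sum(p[mask] * np.log(p[mask] / q[mask])) * dx


kl_lognorm = kl_divergence(p, q_lognorm, dx)
kl_norm = kl_divergence(p, q_norm, dx)
print("KL(empirical || lognormal):", kl_lognorm)
print("KL(empirical || normal):   ", kl_norm)
```

Because the sample here is drawn from a lognormal, the lognormal fit should come out closer to the kernel estimate than the gaussian fit does.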

Warren Weckesser · Accepted Answer

In the call to lognorm.fit(), use floc=0, not floc=mu.

(The location parameter of the lognorm distribution simply translates the distribution. You almost never want to do that with the log-normal distribution.)
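
To make the parameterization concrete, here is a small sketch with illustrative values of mu and sigma (not taken from the question). With loc=0, SciPy's shape parameter is the sigma of the underlying gaussian and scale is exp(mu); a nonzero loc only shifts the support, so the density is zero for every x below loc.

```python
import numpy as np
from scipy import stats

mu, sigma = 1.0, 0.5  # illustrative values, not from the question
x = np.linspace(0.1, 10.0, 50)

# SciPy's lognorm with loc=0: shape = sigma, scale = exp(mu).
pdf_scipy = stats.lognorm.pdf(x, sigma, loc=0, scale=np.exp(mu))

# Density of exp(Z) with Z ~ N(mu, sigma^2), written out by change of variables.
pdf_by_hand = stats.norm.pdf(np.log(x), loc=mu, scale=sigma) / x

print(np.allclose(pdf_scipy, pdf_by_hand))

# A nonzero loc translates the whole distribution: no density below loc.
pdf_shifted = stats.lognorm.pdf(x, sigma, loc=5.0, scale=np.exp(mu))
print(pdf_shifted[x < 5.0].max())
```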

See A lognormal distribution in python

By the way, you are plotting the PDF of the unsorted sample values, so the plot in the corrected script won't look much different. You might find it more useful to plot the PDF against the sorted values. Here's a modification of your script that creates a plot of the PDF using the sorted samples:

from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
import math


def lognormDrive(mu,variance):
    size = 1000
    sigma = math.sqrt(variance)
    np.random.seed(1)
    gaussianData = stats.norm.rvs(loc=mu, scale=sigma, size=size)
    logData = np.exp(gaussianData)
    shape, loc, scale = stats.lognorm.fit(logData, floc=0)
    print("Estimated mu:", np.log(scale))
    print("Estimated var: ", shape**2)
    logData.sort()
    return logData, stats.lognorm.pdf(logData, shape, loc, scale)

x, y = lognormDrive(37, 0.8)
plt.plot(x, y)
plt.grid()
plt.show()

The script prints:

Estimated mu: 37.0347152587
Estimated var:  0.769897988163

and creates the following plot:

[Plot: the fitted lognormal PDF against the sorted sample values]
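
If a smoother curve is wanted, one option is to evaluate the fitted PDF on an evenly spaced grid instead of at the sample values. The following is only a sketch: the quantile bounds (0.1% and 99%) and the grid size are arbitrary choices, and the snake_case names are mine rather than the question's.

```python
import numpy as np
from scipy import stats

np.random.seed(1)
mu, sigma = 37, np.sqrt(0.8)
log_data = np.exp(stats.norm.rvs(loc=mu, scale=sigma, size=1000))

# Fit with the location pinned at zero, as in the corrected script.
shape, loc, scale = stats.lognorm.fit(log_data, floc=0)

# Build the grid from quantiles of the fitted distribution so that it
# covers the bulk of the density without running far into the tail.
x_min, x_max = stats.lognorm.ppf([0.001, 0.99], shape, loc, scale)
x = np.linspace(x_min, x_max, 500)
y = stats.lognorm.pdf(x, shape, loc, scale)
```

The arrays x and y can then be passed to plt.plot(x, y) exactly as in the script above.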
