Python Numpy Random Numbers - inconsistent?

Question

I am trying to generate log-normally distributed random numbers in python (for later MC simulation), and I find the results to be quite inconsistent when parameters are a bit larger.

Below I am generating a series of LogNormals from Normals (and then using Exp) and directly from LogNormals. The resulting means are bearable, but the variances - quite imprecise.. this also holds for mu = 4,5,...

If you re-run the below code a couple of times - the results come back quite different.

Code:

import numpy as np
mu = 10;
tmp1 = np.random.normal(loc=-mu, scale=np.sqrt(mu*2),size=1e7)
tmp1 = np.exp(tmp1)
print tmp1.mean(), tmp1.var()
tmp2 = np.random.lognormal(mean=-mu, sigma=np.sqrt(mu*2), size=1e7)
print tmp2.mean(), tmp2.var()
print 'True Mean:', np.exp(0), 'True Var:',(np.exp(mu*2)-1)

Any advice how to fix this? I've tried this also on Wakari.io - so the result is consistent there as well

Update: I've taken the 'True' Mean and Variance formula from Wikipedia: https://en.wikipedia.org/wiki/Log-normal_distribution

Snapshots of results: 1)

0.798301881219 57161.0894726
1.32976988569 2651578.69947
True Mean: 1.0 True Var: 485165194.41

2)

1.20346203176 315782.004309
0.967106664211 408888.403175
True Mean: 1.0 True Var: 485165194.41

3) Last one with n=1e8 random numbers

1.17719369919 2821978.59163
0.913827160458 338931.343819
True Mean: 1.0 True Var: 485165194.41

Robert Kern · Accepted Answer

Even with the large sample size that you have, with these parameters, the estimated variance is going to change wildly from run to run. That's just the nature of the fat-tailed lognormal distribution. Try running the np.exp(np.random.normal(...)).var() several times. You will see a similar swing of values as np.random.lognormal(...).var().

In any case, np.random.lognormal() is just implemented as np.exp(np.random.normal()) (well, the C equivalent).

Python Numpy Random Numbers - inconsistent?

Answers (2)

Related Questions