Reputation: 9756
I do not understand why the following code works with the normal function and do not for another custom function :
That is the example where I tried to sample the normal distribution :
n = 100000
xx = np.random.uniform(-5, 5, n)
rho = mpl.pylab.normpdf(xx, 0, 1)
rnd = np.random.rand(n)
ix = np.where(rho > rnd)
xx = xx[ix]
h = plt.hist(xx, bins=20, normed=True)
# plot density
x = np.linspace(-5, 5, 100)
plt.plot(x, mpl.pylab.normpdf(x, 0, 1))
It works and I got :
Now if I changed the density, I do not correctly sample it. I checked if the density is well normed and it is. Thus I do not understand where I am wrong
n = 100000
xx = np.random.uniform(0, 1, n)
rho = 2 * np.sin(2 * xx * np.pi)**2
rnd = np.random.rand(n)
ix = np.where(rho > rnd)
xx = xx[ix]
h = plt.hist(xx, bins=20, normed=True)
# plot density
x = np.linspace(0, 1, 100)
print(np.trapz(2 * np.sin(2 * x * np.pi)**2, x))
plt.plot(x, 2 * np.sin(2 * x * np.pi)**2)
Upvotes: 3
Views: 128
Reputation: 20080
You're rejecting too much in first example, and not enough in the second.
Optimal case when you're sampling Y
from 0 to PDFmax.
In first case, you shall call
rnd = np.random.rand(n) / np.sqrt(2.0 * np.pi)
In second case
rnd = 2.0 * np.random.rand(n)
Upvotes: 2
Reputation: 51998
You are doing rejection sampling
In the first case the max value of the pdf is < 1, and you are drawing rnd
from [0,1]
, so all the values are below the max. You are throwing away more values than needed though, since the max is strictly less than 1. In the second case the max of the pdf is 2 but you are still drawing rnd
from [0,1]
in the line
rnd = np.random.rand(n)
You should change that line so it samples uniformly from [0,2]
. Note that the somewhat flat tops of your histograms correspond to the parts of [0,1]
where the pdf is > 1. Your code has no way of treating some of those values differently than others.
Upvotes: 2