Reputation: 515
How do I sample from a normal distribution that is clipped?
I want to sample from N(0, 1)
. But I want the values to be from [-1, +1]
. I cannot apply np.clip
as that would increase the probability of -1
and +1
. I can do stochastic clipping, but then there's no guarantee that it'll fall out of the range.
#standard
s = np.random.normal(0, 1, [10,10])
s = np.clip(s)
#stochastic
for j in range(10)
edge1 = np.where(s[j] >= 1.)[0]
edge2 = np.where(s[j] <= -1)[0]
if edge1.shape[0] > 0:
rand_el1 = np.random.normal(0, 1, size=(1, edge1.shape[0]))
s[j,edge1] = rand_el1
if edge2.shape[0] > 0:
rand_el2 = np.random.normal(0, 1, size=(1, edge2.shape[0]))
s[j,edge2] = rand_el2
Upvotes: 1
Views: 2051
Reputation: 114976
The scipy library implements the truncated normal distribution as scipy.stats.truncnorm
. In your case, you can use sample = truncnorm.rvs(-1, 1, size=sample_size)
.
For example,
In [55]: import matplotlib.pyplot as plt
In [56]: from scipy.stats import truncnorm, norm
Sample 100000 points from the normal distribution truncated to [-1, 1]:
In [57]: sample = truncnorm.rvs(-1, 1, size=100000)
Make a histogram, and plot the theoretical PDF curve. The PDF can be computed with truncnorm.pdf
, or with a scaled version of norm.pdf
.
In [58]: _ = plt.hist(sample, bins=51, normed=True, facecolor='g', edgecolor='k', alpha=0.4)
In [59]: x = np.linspace(-1, 1, 101)
In [60]: plt.plot(x, truncnorm.pdf(x, -1, 1), 'k', alpha=0.4, linewidth=5)
Out[60]: [<matplotlib.lines.Line2D at 0x11f78c160>]
In [61]: plt.plot(x, norm.pdf(x)/(norm.cdf(1) - norm.cdf(-1)), 'k--', linewidth=1)
Out[61]: [<matplotlib.lines.Line2D at 0x11f779f60>]
Here's the plot:
Upvotes: 2
Reputation: 221
I believe that the simplest (perhaps not most efficient) way to do so is to use basic Rejection sampling. It simply consists in simulating values from a N(0,1)
, rejecting those that fall out of the wanted bounds and keeping the others until they stack to the wanted number of samples.
kept = []
while len(kept) < 1000:
s = np.random.normal()
if -1 <= s <= 1:
kept.append(s)
Here I stack things in a basic list ; feel free to use a np.array and replace the length condition with one based on the array's dimensions.
Upvotes: 2
Reputation: 114578
Do the stochiastic clipping iteratively until you don't need it any more. This basically means turning your if
s into a while
. You can also take this opportunity to simplify the out-of bounds condition into a single check rather than a separate check on each side:
s = np.random.normal(0, 1, (10, 10))
while True:
out_of_bounds = np.abs(s) > 1
count = np.count_nonzero(out_of_bounds)
if count:
s[out_of_bounds] = np.random.normal(0, 1, count)
else:
break
Upvotes: 1