Reputation: 45
I have been learning about random sampling methods and am aware that Numpy uses Mersenne-Twister to generate uniform random numbers, how does it then pass these to generate non-uniform distributions?
For example:
np.random.normal(mu,sigma,n)
What algorithm is used here to sample normally distributed numbers? Thanks.
Upvotes: 4
Views: 1580
Reputation: 19855
Your overall question is too broad, it could (and does) fill an entire textbook.
That said, a very quick overview is that techniques for generating non-uniform random numbers fall into a few common categories. These include:
A simple example of each of 1-3 & 5 can be found in Section 4.3 of this tutorial paper.
In practice, a combination of the techniques is often used.
For instance, the normal distribution cannot be found analytically by inversion because that would require being able to write a closed-form equation for the CDF.
Two popular variants for generating normals look at pairs of normals in polar coordinates, i.e., expressed as a direction and a distance. The basic Box-Muller algorithm notes that the direction is uniform from 0 to 2π, and Pythagoras tells us the distance is based on the sum of the squared normals, which has a chi-square(2) distribution (convolution). "Special relations" tell us that a chi-square(2) is an exponential(2), which is easy to generate via inversion. Pulling all the pieces together and converting back to Cartesian coordinates gives the pair of formulas found in the Wikipedia article.
The second variant is Marsaglia's Polar method, which appears to be the method used by NumPy. It avoids the evaluation of sine/cosine transcendental functions by generating points randomly in a square, and rejecting any that aren't contained within a circumscribed circle (acceptance/rejection). It then scales the results using the same chi-square/exponential distance calculation, so it's also drawing on convolution, "special relationships, " and inversion.
The fastest approaches are based on the ziggurat algorithm, which breaks the normal into layers (composition), uses special relations for some of the layers, and uses acceptance/rejection to deal with the tails of a layer.
Upvotes: 4