J.schmidt

Reputation: 719

Excel VBA understanding the randomize statement

I am working on a little program which generates standard normally distributed numbers from a source of uniformly distributed random numbers, so I need to generate a lot of random numbers. I decided to use the Rnd function because the program should be as fast as possible (so I'd prefer not to use an extra seed function).

While doing some research I found that the Rnd function works better when the Randomize statement is used immediately before it. I don't understand the description of the optional number argument. As I understand it, if I don't give Randomize any argument it will use the system timer value as the new seed value.

Can anyone explain to me what the optional number is actually doing with the function? Is there a difference between using Randomize(1) and Randomize(99) or even Randomize("blabla")? I'd like to understand the theory behind this optional input number. Thank You!


Upvotes: 3

Views: 1562

Answers (2)

Michał Turczyn

Reputation: 37430

A seed is used to initialize a pseudorandom number generator. You can think of it as the starting point from which the sequence of pseudorandom numbers is generated. If the seed changes from run to run, the sequence changes with it, which is why the default is to use the current system time (it changes continuously).

From remarks on MSDN article you posted:

Randomize uses number to initialize the Rnd function's random-number generator, giving it a new seed value. If you omit number, the value returned by the system timer is used as the new seed value.

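So if you always specify the same argument, you always supply the same seed, which makes the numbers less unpredictable. Note that the same documentation adds that, to repeat a sequence exactly, you must call Rnd with a negative argument immediately before calling Randomize with a numeric argument.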

If Randomize is not used, the Rnd function (with no arguments) uses the same number as a seed the first time it is called, and thereafter uses the last generated number as a seed value.

Here the last generated random number is used as the seed for the next one, so successive calls keep producing new values.
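The idea of a seed as a starting point, with each number feeding the next, can be sketched with a toy generator. This is only an illustration in Python, not VBA, and it is not the algorithm behind Rnd: the function name `lcg_sequence` is made up for this example, and the constants are the commonly published "Numerical Recipes" values for a linear congruential generator.

```python
# Toy linear congruential generator (LCG), for illustration only.
# Assumption: these constants are the "Numerical Recipes" values,
# NOT the ones used internally by VBA's Rnd.
M = 2**32
A = 1664525
C = 1013904223


def lcg_sequence(seed, count):
    """Return `count` pseudorandom floats in [0, 1) starting from `seed`."""
    state = seed % M
    values = []
    for _ in range(count):
        # The new state depends only on the previous state,
        # so the seed fixes the whole sequence.
        state = (A * state + C) % M
        values.append(state / M)
    return values


first = lcg_sequence(1, 5)
again = lcg_sequence(1, 5)
other = lcg_sequence(99, 5)

print(first == again)  # compares two runs with the same seed
print(first == other)  # compares runs with seeds 1 and 99
```

In this sketch a seed of 1 and a seed of 99 are equally good: the number has no meaning beyond selecting where the sequence starts.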

Upvotes: 2

eirikdaude

Reputation: 3255

To quote from the answer to a very similar question on CrossValidated:

Most pseudo-random number generators (PRNGs) are build (sic) on algorithms involving some kind of recursive method starting from a base value that is determined by an input called the "seed". The default PRNG in most statistical software (R, Python, Stata, etc.) is the Mersenne Twister algorithm MT19937, which is set out in Matsumoto and Nishimura (1998). This is a complicated algorithm, so it would be best to read the paper on it if you want to know how it works in detail. In this particular algorithm, there is a recurrence relation of degree $n$, and your input seed is an initial set of vectors $x_0, x_1, \dots, x_{n-1}$. The algorithm uses a linear recurrence relation that generates:

$$x_{n+k} = f(x_k, x_{k+1}, x_{k+m}, r, A)$$

where $1 \le m \le n$ and $r$ and $A$ are objects that can be specified as parameters in the algorithm. Since the seed gives the initial set of vectors (and given other fixed parameters for the algorithm), the series of pseudo-random numbers generated by the algorithm is fixed. If you change the seed then you change the initial vectors, which changes the pseudo-random numbers generated by the algorithm. This is, of course, the function of the seed.

Now, it is important to note that this is just one example, using the MT19937 algorithm. There are many PRNGs that can be used in statistical software, and they each involve different recursive methods, and so the seed means a different thing (in technical terms) in each of them. You can find a library of PRNGs for R in this documentation, which lists the available algorithms and the papers that describe these algorithms.

The purpose of the seed is to allow the user to "lock" the pseudo-random number generator, to allow replicable analysis. Some analysts like to set the seed using a true random-number generator (TRNG) which uses hardware inputs to generate an initial seed number, and then report this as a locked number. If the seed is set and reported by the original user then an auditor can repeat the analysis and obtain the same sequence of pseudo-random numbers as the original user. If the seed is not set then the algorithm will usually use some kind of default seed (e.g., from the system clock), and it will generally not be possible to replicate the randomisation.
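The "locking" described in the quote can be sketched in Python, whose standard `random` module uses the Mersenne Twister mentioned above. This is a minimal illustration, not VBA: the helper `draw` and the value of `reported_seed` are hypothetical names and numbers chosen for the example.

```python
import random


def draw(seed, count=3):
    """Draw `count` uniform numbers from a generator locked to `seed`."""
    rng = random.Random(seed)  # independent generator with a fixed seed
    return [rng.random() for _ in range(count)]


# Hypothetical seed that an analyst might publish alongside the results.
reported_seed = 20240101

original = draw(reported_seed)  # the analyst's run
replay = draw(reported_seed)    # an auditor's run with the same seed
print(original == replay)       # compares the two seeded runs

# With no seed given, Python picks one from the operating system or the
# clock, so this generator's output generally cannot be replicated later.
unseeded = random.Random()
print(unseeded.random())
```

The seeded runs are replicable because the whole sequence is a function of the seed; the unseeded generator corresponds to the "default seed" case in the quote.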

As the quote in your question shows, the VBA Randomize statement sets a new seed for the Rnd function: the system timer value if you give no argument, or the number you provide if you do. If you don't call Randomize before calling Rnd, Rnd starts from the same initial seed and then uses each previous number as the next seed, so you may keep getting the same sequence of numbers on every run.

I also recommend having a look at this answer.

Upvotes: 2
