drb

Reputation: 738

random_shuffle algorithm - are identical results produced without random generator function?

If a random generator function is not supplied to the random_shuffle algorithm in the standard library, will successive runs of the program produce the same random sequence if supplied with the same data?

For example, if

std::random_shuffle(filenames.begin(), filenames.end());

is performed on the same list of filenames from a directory in successive runs of the program, is the random sequence produced the same as that in the prior run?

Upvotes: 6

Views: 3309

Answers (3)

Mark B

Reputation: 96241

Section 25.2.11 of the standard just says that the elements are shuffled with uniform distribution. It makes no guarantees as to which RNG is used behind the scenes (unless you pass one in), so you can't rely on any such behavior.

To guarantee the same shuffle outcome you'll need to provide your own RNG that makes that guarantee, but I suspect that even then, updating your standard library could change how the random_shuffle algorithm itself behaves.

Upvotes: 6

James Kanze

Reputation: 153919

If you use the same random generator, with the same seed, and the same starting sequence, the results will be the same. A computer is, after all, deterministic in its behavior (modulo threading issues and a few other odds and ends).

If you do not specify a generator, the default generator is implementation defined. Most implementations, I think, use std::rand() (which can cause problems, particularly when the number of elements in the sequence is larger than RAND_MAX). I would recommend getting a generator with known quality, and using it.

If you don't correctly seed the generator which is being used (another reason not to use the default, since how you seed it will depend on the implementation), then you'll get what you get. In the case of std::rand(), if you never call std::srand() it behaves as if seeded with 1, so you get the same sequence on every run. How you seed depends on the generator used. What you use to seed it should vary from one run to the next; for many applications, time(NULL) is sufficient; on a Unix platform, I'd recommend reading however many bytes it takes from /dev/random. Otherwise, hashing other information (IP address of the machine, process id, etc.) can also improve things: it means that two users starting the program at exactly the same second will still get different sequences. (But this is really only relevant if you're working in a networked environment.)
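If you have C++11 available, the seeding advice above can be sketched roughly as follows. Note that this swaps in std::shuffle, std::mt19937 and std::random_device from <algorithm> and <random> rather than random_shuffle itself, and shuffle_with_fresh_seed is a hypothetical name. It also assumes std::random_device is backed by a real entropy source on your platform (typically something like /dev/urandom on Unix), which the standard does not guarantee.

```cpp
#include <algorithm>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: shuffle with a seed that normally differs per run.
void shuffle_with_fresh_seed(std::vector<std::string>& filenames)
{
    // Assumption: random_device is non-deterministic on this platform.
    std::random_device device;

    // A single 32-bit seed does not fill the engine's whole state,
    // but it is enough to vary the order between runs in this sketch.
    std::mt19937 engine(device());

    std::shuffle(filenames.begin(), filenames.end(), engine);
}
```

If you would rather follow the time(NULL) suggestion, you could pass that value to the engine's constructor instead of device(), with the caveat already mentioned about two runs starting in the same second.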

Upvotes: 7

user195488

Reputation:

You may well get an identical result on every run of the program. If this is a problem, you can pass a custom random number generator (which can be seeded from an external source) as the third argument to std::random_shuffle. Some people recommend calling srand(unsigned(time(NULL))); before random_shuffle, but whether random_shuffle uses rand() at all is implementation defined, so this is unreliable.

Upvotes: 4
