Alethes

Reputation: 922

Inconsistent C++ <random> behavior between different computers

I've implemented a C++ program for testing various strategies in reaction to a series of random events. I'm aggregating results from multi-threaded simulations running on a couple of computers.

A single simulation yields one integer outcome and typically requires generating around 100 uniform random integers. It is repeated 1,000,000 times before a chunk of aggregated data (mean, standard deviation, minimum, maximum) is saved. Although the results of such chunks are consistent to 6 significant digits on a given machine, the discrepancies between two computers running exactly the same program are orders of magnitude larger.

So far, I have run the program (the same executable) on two personal Windows notebooks with Intel processors and one AWS c3.8xlarge Windows Server instance. On each computer, the ongoing simulation quickly converges to a different value. The relative discrepancies between the means are of the order of 10^-3. On a single computer, the relative difference of means between 1-million-run chunks rarely exceeds 10^-6.

The program uses the mt19937 random number generator from <random>. I use time(NULL) for seeding.

I can't come up with a reason for such an inconsistency. The Mersenne Twister is considered a sound generator for Monte Carlo simulations, and I have used it many times, often being able to verify the results analytically. I can understand slight differences and deviations from uniformity due to generator imperfections and the underlying architecture, but a discrepancy of this magnitude is hard to comprehend.

Upvotes: 2

Views: 484

Answers (2)

example

Reputation: 3419

It appears you have been able to fix your problem. Let me nevertheless point out one or two (possible) problems with your code. Without actually seeing any source code, it is hard to do more for you.

  • Do not use time(NULL) as your seed. It is a very low-entropy source, which is rather bad. First, two instances of the program running on different machines might very well pick the same seed. Second, two consecutive runs will have only slightly different seeds, which might result in similar or at least correlated random numbers. This second point is especially bad if you create one PRNG per thread (as it is advisable to do!), because threads that are seeded within the same second will all produce identical numbers. Use at least a seed_seq, or better yet a non-deterministic source (random_device).
  • The standard guarantees that the mt19937 engine produces the same sequence for a given seed across different hardware, compilers, and library versions. The same holds for mt19937_64, the (often faster) 64-bit variant, although it is a separate engine with its own sequence. The distributions do not give you the same guarantee: the same engine and seed fed through uniform_int_distribution may yield different numbers on different standard library implementations. If you want to reproduce identical results, you have to write your own distributions. IMHO this is a really bad design flaw of the standard, as it is highly non-trivial to write a good distribution. It should not be necessary in your case, though, as long as you only need randomly distributed numbers and never need to recreate a specific sequence of numbers.
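
The two points above can be sketched as follows. This is a minimal sketch, not the asker's code: make_seeded_engine and portable_uniform are hypothetical helper names, the choice of 8 seed words is an assumption, and the rejection-sampling scheme is just one simple way to get a uniform integer that depends only on the engine's output.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: seed an mt19937 from std::random_device through a
// seed_seq instead of time(NULL). Call it once per thread so that every
// thread gets its own, independently seeded engine.
inline std::mt19937 make_seeded_engine() {
    std::random_device rd;
    std::array<std::uint32_t, 8> entropy;  // 8 words is an arbitrary choice
    for (auto& word : entropy) word = rd();
    std::seed_seq seq(entropy.begin(), entropy.end());
    return std::mt19937(seq);
}

// Hypothetical helper: uniform integer in [lo, hi] via rejection sampling.
// It uses only the raw 32-bit engine output, so for a given seed it yields
// the same values on every conforming standard library.
inline std::uint32_t portable_uniform(std::mt19937& gen,
                                      std::uint32_t lo, std::uint32_t hi) {
    assert(lo <= hi);
    const std::uint32_t range = hi - lo;
    if (range == 0xFFFFFFFFu) return static_cast<std::uint32_t>(gen());
    const std::uint32_t buckets = range + 1;
    // 2^32 mod buckets, computed without leaving 32-bit arithmetic.
    const std::uint32_t excess = (0xFFFFFFFFu % buckets + 1) % buckets;
    // Largest accepted raw value: one less than a multiple of buckets.
    const std::uint32_t limit = 0xFFFFFFFFu - excess;
    std::uint32_t x;
    do {
        x = static_cast<std::uint32_t>(gen());
    } while (x > limit);  // reject the tail that would bias the result
    return lo + x % buckets;
}
```

With a fixed seed, for example std::mt19937 gen(42), portable_uniform(gen, 1, 6) should give the same sequence on every host, which makes it possible to compare machines run by run instead of comparing only aggregated means.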

While it is unlikely that a difference in the distributions caused your problem, it is the only difference there can be between two standard-conforming implementations. In light of this, I would suggest checking the rest of your code thoroughly, as it seems unlikely that the <random> library is actually at fault here.
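
One quick way to rule out the engine itself on each host is the check value the standard prescribes: the 10000th consecutive output of a default-constructed mt19937 must be 4123659995. A small sketch (the function name mt19937_is_conformant is made up for illustration):

```cpp
#include <cstdint>
#include <random>

// Returns true if this library's std::mt19937 matches the standard's
// required check value for a default-constructed engine.
inline bool mt19937_is_conformant() {
    std::mt19937 gen;    // default-constructed, i.e. seeded with 5489
    gen.discard(9999);   // skip the first 9999 outputs
    return gen() == 4123659995u;  // 10000th output required by the standard
}
```

If this returns true on every machine, the raw random stream is identical for identical seeds, and any remaining discrepancy has to come from the distributions or from the rest of the program.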

Upvotes: 1

Alethes

Reputation: 922

After refactoring the program and eliminating unnecessary operations, the results became consistent across hosts. It appears that rounding errors differ significantly among various seemingly similar 64-bit architectures, and that their accumulation, due to certain design flaws, caused a serious divergence in my simulation's results. I'd like to thank @DanielKO, @TonyD, @amdn and @Yakk for their valuable suggestions.
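
The original aggregation code is not shown in this thread, so the following is only an illustration of one common way to keep accumulated rounding error small when computing the chunk statistics mentioned in the question (mean, standard deviation, minimum, maximum). It uses Welford's online algorithm; the struct name ChunkStats and the choice of long long for the outcome are assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical accumulator for one chunk of integer simulation outcomes.
struct ChunkStats {
    std::uint64_t count = 0;
    double mean = 0.0;
    double m2 = 0.0;  // running sum of squared deviations from the mean
    long long min = std::numeric_limits<long long>::max();
    long long max = std::numeric_limits<long long>::min();

    // Welford's update: avoids subtracting two large, nearly equal sums.
    void add(long long outcome) {
        ++count;
        const double x = static_cast<double>(outcome);
        const double delta = x - mean;
        mean += delta / static_cast<double>(count);
        m2 += delta * (x - mean);
        min = std::min(min, outcome);
        max = std::max(max, outcome);
    }

    // Sample standard deviation (divides by count - 1).
    double stddev() const {
        return count > 1 ? std::sqrt(m2 / static_cast<double>(count - 1))
                         : 0.0;
    }
};
```

Since each simulation yields an integer, another option is to keep the sum and the sum of squares in 64-bit integers and convert to floating point only once at the end. Integer addition is exact and independent of the order in which threads contribute, provided the totals do not overflow.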

An interesting note: from the very start, the c3.8xlarge AWS instance consistently provided the same (correct) results. In contrast, the Core 2 took the most severe beating.

Upvotes: 1
