name

Reputation: 303

Random number generator performance varies between platforms

I am testing the performance of random number generators in C++ and have come across some very strange results that I do not understand.

I compared std::rand against std::uniform_real_distribution driven by std::minstd_rand.

Code for timing std::rand

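// Fragment from inside main(); needs <chrono>, <cstdlib>, and <iostream>.
auto start = std::chrono::high_resolution_clock::now();

// The return value of std::rand() is discarded on purpose.
for (int i = 0; i < 1000000; ++i)
    std::rand();

auto finish = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> elapsed = finish - start;  // seconds
std::cout << "Elapsed time: " << elapsed.count() * 1000 << " ms\n";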

Code for timing std::uniform_real_distribution with std::minstd_rand

// Fragment from inside main(); needs <chrono>, <iostream>, and <random>.
// Despite its name, Mt is a std::minstd_rand engine seeded from the clock.
std::minstd_rand Mt(std::chrono::system_clock::now().time_since_epoch().count());
std::uniform_real_distribution<float> Distribution(0, 1);

auto start = std::chrono::high_resolution_clock::now();

// The generated value is discarded on purpose.
for (int i = 0; i < 1000000; ++i)
    Distribution(Mt);

auto finish = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> elapsed = finish - start;  // seconds
std::cout << "Elapsed time: " << elapsed.count() * 1000 << " ms\n";

When compiling with Microsoft Visual Studio 2019 on a Dell Latitude 7390 (i7-8650U, 1.9 GHz), I get the following timings:

std::rand -> Elapsed time: 45.7106 ms
std::uniform_real_distribution -> Elapsed time: 65.7437 ms

I have compiler optimizations turned on, with the additional command-line option -D__FMA__.

However, when compiling with g++ on a MacBook Air running macOS High Sierra (1.4 GHz i5), I get the following timings:

std::rand -> Elapsed time: 9.4547 ms
std::uniform_real_distribution -> Elapsed time: 7.9e-05 ms

This was built with the terminal command "g++ prng.cpp -o prng -std=c++17 -O3".

Another problem was that on the Mac, the measured time for uniform_real_distribution changed depending on whether or not I printed the generated value.

So this version:

std::minstd_rand Mt(std::chrono::system_clock::now().time_since_epoch().count());
std::uniform_real_distribution<float> Distribution(0, 1);

float num;

auto start = std::chrono::high_resolution_clock::now();

for (int i = 0; i < 1000000; ++i)
    num = Distribution(Mt);

auto finish = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> elapsed = finish - start;
std::cout << "Elapsed time: " << elapsed.count() * 1000 << " ms\n";
std::cout << num << '\n';

would give me a time of 5.82409 ms,

whereas without printing I get 7.9e-05 ms. Note that printing only affects the test for uniform_real_distribution; I do not need to do this for std::rand. I also tested using mersenne instead of std::minstd_rand, which does not suffer from the same issue.

I originally thought that the compiler was optimizing away the uniform_real_distribution calls when the result wasn't stored or printed, since a variable that is never used can be omitted. But then why doesn't the compiler do the same for std::rand? And why do these random functions run faster on Mac than on Windows?

EDIT: For clarification, "mersenne" refers to std::mt19937_64 being used instead of std::minstd_rand as the engine for uniform_real_distribution.

Upvotes: 2

Views: 891

Answers (1)

Peter O.

Reputation: 32878

All of the distributions in the C++ standard library (including uniform_real_distribution) use an implementation-defined algorithm. (The same applies to std::rand, which defers to the C standard's rand function.) Thus, it's natural that there would be performance differences between these distributions in different implementations of the C++ standard library. See also this answer.

You may want to try testing whether there are performance differences in the C++ random engines (such as std::minstd_rand and std::mt19937), which do specify a fixed algorithm in the C++ standard. To do so, generate a random number in the engine directly and not through any C++ distribution such as uniform_int_distribution or uniform_real_distribution.


I originally thought that this was compiler optimizations omitting the uniform_real_distribution when it wasn't stored / printed as the variable isn't used and thus can be omitted but then why doesn't the compiler do the same for std::rand[?]

I presume the compiler can perform this optimization because, in practice, the engines and distributions of the C++ standard library are implemented as C++ code (largely templates in headers) that is visible to the compiler, so the compiler can optimize that code as it sees fit. std::rand is different: it is typically provided as a function whose implementation is not available to the compiler, which limits the optimizations the compiler can apply.

Upvotes: 5
