Reputation: 61
I am trying to generate uniformly distributed random numbers with the mt19937 engine and std::random_device as the seed source. If I get lucky I get a couple hundred thousands unique number out of the 4 billion possible values. I was wondering if it could get any better than this.
I attempted to increase the entropy using high resolution timer and the random device using seed_seq (https://stackoverflow.com/a/34493057/5852409) also tried initializing all the 624 states of the mt19937 (https://codereview.stackexchange.com/a/109266). However, did not see any improvement.
#include <random>
#include <iostream>
#include <set>
void main()
{
std::random_device rd;
std::mt19937 engn(rd());
std::uniform_int_distribution<unsigned int> unidist(0, 0xFFFFFFFF - 1);
std::set<unsigned int> s;
auto itr = s.insert(unidist(engn));
int k = 0;
while (itr.second)
{
itr = s.insert(unidist(engn));
k++;
}
std::cout << k << '\n';
}
Upvotes: 0
Views: 110
Reputation: 6706
First and foremost, make sure you understand the birthday paradox. I.e. the fact that you get a duplicate after some ten or hundred thousand numbers does not indicate a statistical deficiency in the mt19937
.
As a rough estimate due to this paradox, duplicates become likely after about the square root of possible values even for a perfect random generator, in this case after about 2^16 = 65536 values.
Second, note that entropy does not mean uniqueness of outputs. Imagine throwing a perfectly fair 100-sided die 100 times. The likelihood that at least one value appears twice is actually much greater than the likelihood that each value is seen exactly once. Entropy is a measure for the number of states in a system. Entropy in your case relates to the quality of your seed (covering many different initial states of the PRNG), not the uniqueness of outputs.
Third, if you have a use case where you must ensure uniqueness (of IDs or handles, for example), but you need poor predictability aka randomness, you have basically two options:
Upvotes: 2