Reputation: 56547
I have the following question. Suppose I want to generate a std::vector<bool>, or a std::vector<unsigned int>, or even a C-style array that contains only 0's and 1's, uniformly distributed (i.e., each run should produce something like 0110 or 1011, you get the idea). There are 2 approaches:
1. Use std::uniform_int_distribution(0,1) and draw one value per element.
2. Generate a single random integer between 0 and 2^n-1 (where n is the number of bits), then use a bitset<n> to read off the individual bits.
I am talking here about large vectors; I did some timing but didn't get any clear insight.
Does anyone know which method is more efficient? In my opinion the second approach should be better, but I am not convinced.
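For concreteness, here is a minimal sketch of the two approaches as I understand them (assumptions: std::mt19937 as the engine, and 32-bit blocks for the bitset variant, since n must be a compile-time constant):

#include <bitset>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

int main()
{
    const std::size_t N = 1000000;
    std::mt19937 gen(std::random_device{}());
    std::vector<unsigned int> bits(N);

    // Approach 1: one distribution call per element.
    std::uniform_int_distribution<unsigned int> coin(0, 1);
    for (std::size_t i = 0; i < N; ++i)
        bits[i] = coin(gen);

    // Approach 2: draw 32 random bits at once, unpack them via bitset<32>.
    std::uniform_int_distribution<std::uint32_t> block(0, 0xFFFFFFFFu);
    for (std::size_t i = 0; i < N; i += 32)
    {
        std::bitset<32> b(block(gen));
        for (std::size_t j = 0; j < 32 && i + j < N; ++j)
            bits[i + j] = b[j];
    }
}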
Upvotes: 1
Views: 248
Reputation: 3557
If performance is a concern and your output set needs to be large, I'd skip vector<bool>. While compact, it is slower when accessing individual elements, because each access has to do bit masking.
Either use a std::vector<unsigned int> and call .resize(N) on it to make sure it's large enough, or allocate a C-style array via malloc. Either of these gives fast access via [index].
Secondly, use your own RNG; many system ones are poor. E.g. MSVC uses an LCG, albeit shifted right by 16 bits. Even so, http://en.wikipedia.org/wiki/Linear_congruential_generator notes that they should be avoided if you really care. My recommendation is an xorshift RNG, because it's (A) blindingly fast, and (B) designed by George Marsaglia, who was a bit of an RNG expert.
The other advantage is that you can have your own RNG return a full 32-bit unsigned int per call, whereas MSVC's rand() (and possibly others) only provides 15 bits per call. Therefore you'd make over twice as many calls to rand() as you would to your own generator.
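For reference, a minimal xorshift32 sketch (the shift constants 13/17/5 and the seed come from Marsaglia's paper); the nextRandom() name and the global state are assumptions, since the original answer leaves the RNG implementation to you:

#include <cstdint>

// Assumed global state; seed with any non-zero value.
static std::uint32_t rngState = 2463534242u;

// Marsaglia xorshift32: returns a full 32-bit random value per call.
std::uint32_t nextRandom()
{
    std::uint32_t x = rngState;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    rngState = x;
    return x;
}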
To actually fill the array, you'd do something like:
void fillarray(unsigned int *data, unsigned int count)
{
    unsigned int randomValue = 0;
    for (unsigned int index = 0; index < count; index++)
    {
        // Pull a fresh 32-bit value from the RNG every 32 iterations.
        if ((index & 31) == 0)
        {
            randomValue = nextRandom();
        }
        // Peel off the low bit, then shift so the next iteration sees a new bit.
        data[index] = randomValue & 1;
        randomValue >>= 1;
    }
}
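Calling it from a std::vector<unsigned int> sized as described above might look like this (a hypothetical usage sketch, just to tie the pieces together):

#include <vector>

int main()
{
    const unsigned int N = 1000000;
    std::vector<unsigned int> bits;
    bits.resize(N);
    fillarray(bits.data(), N);   // each element is now 0 or 1
}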
Upvotes: 4