Reputation: 2139
So, I'm trying to create a random vector (think geometry, not an expandable array), and every time I call my random vector function I get the same x value, though y and z are different.
int main () {
srand ( (unsigned)time(NULL));
Vector<double> a;
a.randvec();
cout << a << endl;
return 0;
}
using the function
//random Vector
template <class T>
void Vector<T>::randvec()
{
const int min=-10, max=10;
int randx, randy, randz;
const int bucket_size = RAND_MAX/(max-min);
do randx = (rand()/bucket_size)+min;
while (randx <= min && randx >= max);
x = randx;
do randy = (rand()/bucket_size)+min;
while (randy <= min && randy >= max);
y = randy;
do randz = (rand()/bucket_size)+min;
while (randz <= min && randz >= max);
z = randz;
}
For some reason, randx will consistently return 8, whereas the other numbers seem to be following the (pseudo) randomness perfectly. However, if I put the call to define, say, randy before randx, randy will always return 8.
Why is my first random number always 8? Am I seeding incorrectly?
Upvotes: 16
Views: 30216
Reputation: 2206
rand
always generate the same number?There are three potential causes to investigate. One or more may be contributing factors.
rand
rand
onto the desired output rangeA lot has changed in the 14 years since the OP was written in 2010. Back then, C++11 was still in the draft stage, and <random>
had not yet been officially adopted. The concept of separate random number engines and random number distributions had not yet taken hold.
The OP asks, "[Why is] rand()
generating the same number – even with srand(time(NULL))
in my main
?" What he means is, "Why does the random number distribution coded in function randvec
always generate the same number, on the first call, after seeding rand
?" As is shown in the data below, so long as it is given different seeds, rand
, itself, does generate numbers that are, at least, a little bit different.
rand
?In the comments, the OP, @Nick Sweet, revealed that he was using Xcode in 2010. Cross-checking dates, that means his system was running OS X. According to Wikipedia, Apple's implementation of rand
at that time was part of CarbonLib. It used the multiplicative congruential generator (MCG) published as MINSTD, in 1988, by Park and Miller. The same generator can be found in header <random>
, as std::minstd_rand0
.
In the 1980s, when microprocessors were still using 16-bit words, and clock rates were measured in megahertz, MINSTD had a place. Today, it has fallen out of favor. Among other things, its period of 2^31 is considered by many to be unacceptably short.
Is MINSTD the problem? To find out, I tested five random number engines:
std::minstd_rand0
– MINSTD (Park & Miller, 1988)std::minstd_rand
– MINSTD (Park & Miller, 1993)tbx::knuth_lcg
– 64-bit LCG (linear congruential generator) used by pcg32
tbx::pcg32_engine
– 32-bit PCG (permuted congruential generator) designed by M.E. O'Neill. It implements O'Neill's pcg32_oneseq
.std::mt19937
– 32-bit Mersenne Twister engineSource code for tbx::pcg_engine
is too long to be included here. It was adapted from the program published by M.E. O'Neill.
Source code for tbx::knuth_lcg
, however, is quite short. Due to its period of 2^64, and its well-chosen parameters, tbx::knuth.lcg
is superior to MINSTD.
#pragma once
// tbx.random.knuth_lcg.h
#include <cstdint>
#include <random>
namespace tbx
{
// Donald E. Knuth
// The Art of Computer Programming, Volume 2
// Second Edition, 1981
//
// Section 3.3.4 Table 1
// Page 102: Table 1 lists LCG constants for a variety of generators.
// Page 105: "Line 30 (whose excellent multiplier 6364136223846793005
// is too big to fit in the column) is due to C.E. Haynes."
//
// Knuth's book gives no mention of the increment 1442695040888963407.
//
// Melissa E. O'Neill describes this as "Knuth's preferred 64-bit LCG."
// https://www.pcg-random.org/posts/cpp-seeding-surprises.html
using knuth_lcg = std::linear_congruential_engine
< std::uint64_t // result_type
, 6'364'136'223'846'793'005ull // a = multiplier
, 1'442'695'040'888'963'407ull // c = increment
, 0ull // m = modulus (true modulus is 2^64)
>;
}
// end file: tbx.random.knuth_lcg.h
std::time(NULL)
?On many systems, the time values returned by std::time(NULL)
have a one-second resolution. When a fast-running program makes several calls to srand(time(NULL))
, there is a significant chance that all of them will use the same seed!
If the time values used as seeds are not identical, they may be very similar, perhaps separated by only a few seconds. In that case, if the PRNG does not condition the seed in some way, to further randomize it, then the initial outputs from the PRNG could be similar. Cycling the generator a "small" number of times can lessen the effect of such a seeding.
Is the seed the problem? To find out, each random number engine was tested with two seeding methods. Tests were run at one-second intervals.
std::srand(std::time(NULL))
std::srand(std::random_device{}());
std::random_device
is part of the <random>
header. It generates 32-bit unsigned values that are intended to be used as seeds. Depending on the compiler, the values it generates can be truly random values, generated in hardware, or high-quality, pseudorandom values, often generated by a cryptographically secure pseudorandom number generator (CSPRNG).
Note: std::random_device
comes with its own set of issues. In particular, MinGW distributions of GCC, prior to version 9.2, were buggy. Here is the report. Those versions used std::mt19937
with a fixed seed, and std::random_device
generated the same sequence, every time.
Many of the commenters from 2010 did not recognize that function randvec
implements a rejection sampling algorithm, to generate uniformly distributed random values. A surprising number of them recommend ditching the loops that it uses, in favor of the biased values produced by:
inline int biased_rand(int const a, int const b) {
// Genenerate a random value on the closed interval [a, b].
return a + std::rand() % (b - a + 1); // BIASED! - not uniformly distributed
}
It's hard to blame them. The rejection tests in the OP were miscoded, using &&
, instead of ||
. No wonder folks couldn't see what was happening.
According to the OP, function randvec
was adapted from an algorithm published in Accelerated C++ (2000), by Andrew Koenig & Barbara E. Moo. The algorithm was implemented as function nrand
, in Chapter 7 of that book. Compared to the biased technique above, it uses a quotient, rather than a remainder, to map the values returned by rand
onto the output range. Because of this, it is sensitive to the high-order bits coming from rand
, rather than the low-order bits. When two values from rand
share the same high-order bits, the values returned by nrand
will often be the same. Conversely, when the high-order bits vary, so will the values returned by nrand
. It is less sensitive to changes in the low-order bits.
Here is the source code from Accelerated C++:
//======================================================================
// nrand
// return a random integer in the range [0, n)
// Source: Andrew Koenig & Barbara E. Moo – 'Accelerated C++' (2000)
// Section 7.4.4 Selecting a random element
// Algorithm: Use "rejection sampling" to guarantee that generated
// values are uniformly distributed.
//======================================================================
int nrand(int n)
{
if (n <= 0 || n > RAND_MAX)
throw domain_error("Argument to nrand is out of range");
const int bucket_size = RAND_MAX / n;
int r;
do r = rand() / bucket_size;
while (r >= n);
return r;
}
nrand
works by dividing the input range [0, RAND_MAX] into n
equally sized "buckets," numbered 0
to n-1
. Each bucket contains bucket_size
distinct values from the input range. Only full buckets are allowed. If there are leftover values from [0, RAND_MAX], which are not assigned to any bucket, they are rejected, whenever one of them is produced by a call to rand
.
The integer quotient rand() / bucket_size
produces a result on the range [0, n]. In the program, the quotient is called r
, but you can think of it as bucket_number
. When r
equals n
, it is rejected, and the program loops back, to call rand
again. Otherwise, r
is a "bucket number" that lies on the half-open interval [0, n).
Note: Although the algorithm in nrand
is correct, the values it returns lie on the half-open interval [0, n). When nrand
is invoked with n
equals RAND_MAX
, it can never return RAND_MAX
. See below for a way to extend the algorithm in nrand
, so that RAND_MAX
can be returned.
The following function uses nrand
to generate uniformly distributed values on the closed range [a, b]. Note, however, that you cannot call it with a
equals 0, and b
equals RAND_MAX
. If you do, the argument to nrand
will be RAND_MAX + 1
, and one of two things will happen:
RAND_MAX
is less than INT_MAX
, nrand
will throw an exception.RAND_MAX + 1
will overflow, and undefined behavior will follow.inline int uniform_rand(int const a, int const b) {
// Generate a random value on the closed interval [a, b].
return a + nrand(b - a + 1); // values are uniformly distributed on [a, b]
}
Given these functions, function randvec
, from the OP, can be coded as follows:
// random Vector - from the OP
template <class T>
void Vector<T>::randvec()
{
const int a = -10, b = 10;
x = uniform_rand(a, b);
y = uniform_rand(a, b);
z = uniform_rand(a, b);
}
Is nrand
the reason why randvec
fails on the first call after seeding? To find out, I tested each combination of RNG and seed with the two random number functions given above:
biased_rand(-10, 10)
uniform_rand(-10, 10)
Here is an improved version of uniform_rand
, which can return RAND_MAX
. Given that std::rand
is best avoided, it is more of a curiosity than anything else. If, however, you do choose to use rand
, you should consider using this function. For production work, you should use std::uniform_int_distribution
, instead.
#pragma once
// tbx.random.uniform_rand.h
namespace tbx
{
int uniform_rand(int const a, int const b);
}
// end file: tbx.random.uniform_rand.h
// tbx.random.uniform_rand.cpp
#include <cstdlib> // rand, srand. RAND_MAX
#include <random> // random_device
#include <stdexcept> // runtime_error
#include "tbx.random.uniform_rand.h"
namespace tbx
{
//==================================================================
// uniform_rand
// Return a uniformly distributed, pseudo-random integer on the
// closed interval [a, b]. This function uses "rejection sampling."
//
// Precondition: (a <= b) && (b - a <= RAND_MAX)
//
// It is implementation-defined whether `std::rand` is thread-safe.
// If `std::rand` is thread-safe, then `tbx::uniform_rand` will
// also be thread-safe, but not otherwise.
//==================================================================
int uniform_rand(int const a, int const b)
{
// Call `srand` on the first pass, and never again, thereafter.
static thread_local bool const is_seeded{
[]() { std::srand(std::random_device{}()); return true; }()
};
if (a > b) {
throw std::runtime_error("tbx::uniform_rand: a > b");
}
static_assert(sizeof(long long) > sizeof(int));
auto const aa{ static_cast<long long>(a) };
auto const bb{ static_cast<long long>(b) };
auto const rand_max{ static_cast<long long>(RAND_MAX) };
if (bb - aa > rand_max) {
throw std::runtime_error("tbx::uniform_rand: b - a > RAND_MAX");
}
unsigned const desired_range{ 1u + static_cast<unsigned>(bb - aa) };
unsigned const range_of_rand{ 1u + static_cast<unsigned>(RAND_MAX) };
unsigned const bucket_size{ range_of_rand / desired_range };
unsigned n; // "bucket number"
do {
n = static_cast<unsigned>(std::rand()) / bucket_size;
} while (n >= desired_range);
return a + static_cast<int>(n);
}
}
// end file: tbx.random.uniform_rand.cpp
This first table shows the output from MINSTD. The first five columns contain data for the seed std::time(NULL)
. The final four columns, have the seed set to std::random_device{}()
.
In columns 1 and 2, the time values increase by 1 second for each row. That's because I programmed the thread to sleep for 1 second between rows.
In column 3, the values returned by rand
are very similar. They all begin with 366
, and the fourth digit is either 3
, 4
, 5
or 6
. With MINSTD, using the current time, alone, as a seed, is not a good idea.
Columns 4 shows the values returned by uniform_rand
on the first call after seeding. It never changes. This combination of RNG and seed is a failure.
Column 5 shows the values returned by biased_rand
. Note the repeating pattern in column 5. This is also a failure.
We now have the answers to the two questions in the OP.
Why is my first random number always 8? Am I seeding incorrectly?
The answers are: (1) When closely spaced calls to std::time(NULL)
are used as seeds for MINSTD, the results returned by its first invocation are less than random. (2) Yes. See Simple Portable C++ Seed Entropy by M.E. O'Neill for some ideas on a better seeding. Otherwise, just use std::random_device{}()
.
My untested hypothesis is the the additive constant, which is 0 in MINSTD, is a contributing factor in the failures in columns 3-5. Comparitively, tbx::knuth_lcg
did a better job.
MINSTD is a multiplicative congruential generator (MCG). In addition to its modulus, the state transition function in MINSTD has only one parameter: its multiplier. With tbx::knuth_lcg
, there are two: a multiplier and an additive constant. The combination means that similar seeds diverge more in a single cycle of tbx::knuth_lcg
than they do in MINSTD.
At least, that's the hypothesis.
The next four columns demonstrate that MINSTD is usable, but only if you have a better seed.
Column 6 shows the seeds returned by std::random_device
. Anecdotally, they appear to be nice and random. So do the values returned from the first call to rand
, which are shown in column 7.
Columns 8 and 9 show the output generated by functions uniform_rand
and biased_rand
. Anecdotally, both columns appear to be random, but we know that rigorous testing would reveal that the values in column 9 have a tiny bias.
minstd_rand0 a: -10 b: 10
1st call seed with 1st call
std::time(NULL) seed to rand uniform_rand biased_rand random_device to rand uniform_rand biased_rand
--------------- ---------- ---------- ------------ ----------- ------------- ---------- ------------ -----------
1725470667 1725470667 366331181 -7 10 79934811 1287089102 2 -5
1725470668 1725470668 366347988 -7 -4 523355155 2084555620 10 -3
1725470670 1725470670 366381602 -7 10 814741099 997917621 -1 5
1725470671 1725470671 366398409 -7 -4 3582022895 502236267 -6 5
1725470672 1725470672 366415216 -7 3 3765154492 1050920895 0 -7
1725470673 1725470673 366432023 -7 10 2560642594 1147791478 1 -9
1725470674 1725470674 366448830 -7 -4 1060201997 1143144420 1 2
1725470675 1725470675 366465637 -7 3 3896449311 109754712 -9 8
1725470676 1725470676 366482444 -7 10 499904143 932904337 -1 -9
1725470677 1725470677 366499251 -7 -4 2931344698 1687993459 6 9
1725470678 1725470678 366516058 -7 3 2601193000 1926149021 8 10
1725470679 1725470679 366532865 -7 10 415279526 281140732 -8 9
1725470680 1725470680 366549672 -7 -4 1042354394 1826191379 7 -5
1725470681 1725470681 366566479 -7 3 1405921226 555477441 -5 5
1725470682 1725470682 366583286 -7 10 3238867965 1238403599 2 10
1725470683 1725470683 366600093 -7 -4 4147558193 691368131 -4 10
1725470684 1725470684 366616900 -7 3 1312675989 1045841492 0 7
1725470685 1725470685 366633707 -7 10 4202573649 1918168913 8 7
1725470686 1725470686 366650514 -7 -4 66238746 878074876 -2 -3
1725470687 1725470687 366667321 -7 3 3397492534 66845208 -10 -7
Data for std::minstd_rand
(not shown), have the same failings (and successes) as the data above.
tbx::knuth_lcg
fared better than MINSTD. When faced with a series of similar seeds (column 2), it generated what appears, anecdotally, to be random output (column 3).
tbx::knuth_lcg
, however, has its own problems. To the naked eye, the values in column 3 look random, but the values in column 4 raise suspicions. There are not enough repeats. Think of the Birthday Problem.
We drew 20 values, from a population of 21, and there were only two repeats (0 and -5). Column 8, where the seeding is better, is more satisfying. It has many repeats, and even a couple of triples.
In many runs of the program (but not all), the same pattern appeared. It happened enough to concern me, even though I am just using the "eyeball" test to evaluate the data.
When the seeds are good, as in column 6, tbx::knuth_lcg
yields good results.
knuth_lcg a: -10 b: 10
1st call seed with 1st call
std::time(NULL) seed to rand uniform_rand biased_rand random_device to rand uniform_rand biased_rand
--------------- ---------- ----------------------- ------------ ----------- ------------- ----------------------- ------------ -----------
1725470729 1725470729 7682619102818709988 7 -9 2761315232 8143472074341520495 8 9
1725470730 1725470730 4823383289810727185 0 -5 1940100256 8838089308847210863 10 0
1725470731 1725470731 1964147476802744382 -6 -1 4177479439 1647040385966567410 -7 -3
1725470732 1725470732 8328283700649537387 8 -10 4232132119 7189879011191188570 6 3
1725470733 1725470733 5469047887641554584 2 -6 1506009362 1118517872796630905 -8 1
1725470734 1725470734 2609812074633571781 -5 -2 1514642789 6095452229502519056 3 -5
1725470735 1725470735 8973948298480364786 10 10 1095415460 119404177483429923 -10 -10
1725470736 1725470736 6114712485472381983 3 -7 583154339 8072170026490244342 8 7
1725470737 1725470737 3255476672464399180 -3 -3 3460774016 6240305398129356751 4 -9
1725470738 1725470738 396240859456416377 -10 1 4039019375 1192875450421810386 -8 -10
1725470739 1725470739 6760377083303209382 5 -8 3183640201 3021312392947218020 -4 4
1725470740 1725470740 3901141270295226579 -2 -4 2846194354 6061817392193233049 3 -2
1725470741 1725470741 1041905457287243776 -8 0 12515812 3252120412882307683 -3 0
1725470742 1725470742 7406041681134036781 6 -9 4079972030 3154528753519180469 -3 -5
1725470743 1725470743 4546805868126053978 0 -5 2809010452 8336560618007352787 8 -3
1725470744 1725470744 1687570055118071175 -7 -1 2377801474 747086193202967209 -9 -6
1725470745 1725470745 8051706278964864180 8 -10 4231172747 1009655311069690046 -8 10
1725470746 1725470746 5192470465956881377 1 -6 2072539433 7046472142399192196 6 -2
1725470747 1725470747 2333234652948898574 -5 -2 2237413946 8140891742275797889 8 3
1725470748 1725470748 8697370876795691579 9 10 386168962 4228694408572382761 -1 9
Data for std::mt19937
were uniformly good.
The reason it does well with poor seeding is likely due to the fact that it conditions the seeds you give it, before installing them in the engine.
When you seed std::mt19937
with a single (unsigned) integer, it pushes that seed through a seed sequence, to fill the 624 state variables used by the engine. On the outside, the seeds in column 2 all look similar. By the time they hit the engine, however, they have been significantly "randomized."
mt19937 a: -10 b: 10
1st call seed with 1st call
std::time(NULL) seed to rand uniform_rand biased_rand random_device to rand uniform_rand biased_rand
--------------- ---------- ---------- ------------ ----------- ------------- ---------- ------------ -----------
1725470688 1725470688 1846392152 8 7 3488227490 1683360094 6 0
1725470689 1725470689 923719378 -1 -9 4172931053 1019958242 -1 -8
1725470690 1725470690 1919984915 8 -8 3443978328 1172527630 1 -9
1725470691 1725470691 1612674401 5 -8 3542463677 926495713 -1 0
1725470692 1725470692 339483995 -7 1 2534471012 1768656412 7 9
1725470693 1725470693 881359337 -2 -5 38428959 47618872 -10 -3
1725470694 1725470694 1481684138 4 1 3098883754 727394473 -3 -9
1725470695 1725470695 1289557349 2 7 1460710575 351019927 -7 3
1725470696 1725470696 593798350 -5 9 167206676 1070657304 0 -7
1725470697 1725470697 695392031 -4 -5 1215347282 924400893 -1 -7
1725470698 1725470698 1152322541 1 10 3860368725 1741906414 7 0
1725470699 1725470699 111164029 -9 -6 2743553328 377480583 -7 8
1725470700 1725470700 1998001782 9 -10 2268784299 827285329 -2 0
1725470701 1725470701 814093545 -3 -4 1593534606 548682246 -5 -10
1725470702 1725470702 1511789889 4 5 142519159 1539300695 5 -8
1725470703 1725470703 1885533215 8 10 3120646034 798241165 -3 -9
1725470704 1725470704 61821576 -10 2 2487963867 659747497 -4 0
1725470705 1725470705 1211428588 1 3 531291314 1551904191 5 2
1725470706 1725470706 1724047358 6 -5 1915141704 367916844 -7 5
1725470707 1725470707 1375067752 3 6 1484660056 2086820278 10 -6
Results for pcg_engine
, which also conditions its seeds, were similar.
The primary conclusion is that std::time(NULL)
is a poor source of random seeds. When a program makes several calls to it, the odds are good that the seeds it returns will be very similar. In the worst case, which is not uncommon, the seeds will be identical.
When the values returned by std::time(NULL)
were not identical, they worked when used as seeds for std::mt19937
and pcg_engine
. They failed, however, with the other engines. When the values were identical, of course, they failed for all engines.
std::time(NULL)
can still be used as a seed, but it should be combined with other sources of "entropy." This can be accomplished by exclusive-or'ing it with the addresses of certain objects in memory or with a call to std::random_device
(or both). See Simple Portable C++ Seed Entropy, by M.E. O'Neill.
Otherwise, just use std::random_device
to generate your seeds.
Although it has a checkered history, std::rand
might not be as much of a problem as std::time(NULL)
. That depends, of course, on your system. Even when rand
is implemented using the lowly MINSTD generator, as in the OP, the foregoing data show that proper seeding can yield serviceable results. Given a choice, you should prefer something like std::mt19937
or pcg32_engine
. For lightweight applications, however, rand
may be fine.
As for function nrand
, the random number distribution, that's a case of garbage-in, garbage-out. When you feed it a series of similar random values, in which the high-order bits are constant, it is to be expected that nrand
will struggle. Interestingly, in tests of both std::minstd_rand0
and std::minstd_rand
, the times when nrand
struggled were also the times when biased_rand
had problems of its own.
Upvotes: 0
Reputation: 21
Not directly related to the code in this question, but I had same issue with using
srand ((unsigned)time(NULL))
and still having same sequence of values being returned from following calls to rand()
.
It turned out that srand needs to called on each thread you are using it on separately. I had a loading thread that was generating random content (that wasn't random cuz of the seed issue). I had just using srand in the main thread and not the loading thread. So added another srand ((unsigned)time(NULL))
to start of loading thread fixed this issue.
Upvotes: 0
Reputation: 49
I had the same problem exactly. I fixed it by moving the srand() call so it was only called once in my program (previously I had been seeding it at the top of a function call). Don't really understand the technicalities - but it was problem solved.
Upvotes: 4
Reputation: 9866
I don't see any problem with your srand()
, and when I tried running extremely similar code, I did not repeatedly get the same number with the first rand()
. However, I did notice another possible issue.
do randx = (rand()/bucket_size)+min;
while (randx <= min && randx >= max);
This line probably does not do what you intended. As long as min < max
(and it always should be), it's impossible for randx
to be both less than or equal to min
and greater than or equal to max
. Plus, you don't need to loop at all. Instead, you can get a value in between min and max using:
randx = rand() % (max - min) + min;
Upvotes: 5
Reputation: 263350
A simple quickfix is to call rand
a few times after seeding.
int main ()
{
srand ( (unsigned)time(NULL));
rand(); rand(); rand();
Vector<double> a;
a.randvec();
cout << a << endl;
return 0;
}
Just to explain better, the first call to rand() in four sequential runs of a test program gave the following output:
27592
27595
27598
27602
Notice how similar they are? For example, if you divide rand()
by 100, you will get the same number 3 times in a row. Now take a look at the second result of rand() in four sequential runs:
11520
22268
248
10997
This looks much better, doesn't it? I really don't see any reason for the downvotes.
Upvotes: 3
Reputation: 19
Your implementation, through integer division, ignores the smallest 4-5 bit of the random number. Since your RNG is seeded with the system time, the first value you get out of it will change only (on average) every 20 seconds.
This should work:
randx = (min) + (int) ((max - min) * rand() / (RAND_MAX + 1.0));
where
rand() / (RAND_MAX + 1.0)
is a random double value in [0, 1) and the rest is just shifting it around.
Upvotes: 1
Reputation:
The issue is that the random number generator is being seeded with a values that are very close together - each run of the program only changes the return value of time() by a small amount - maybe 1 second, maybe even none! The rather poor standard random number generator then uses these similar seed values to generate apparently identical initial random numbers. Basically, you need a better initial seed generator than time() and a better random number generator than rand().
The actual looping algorithm used is I think lifted from Accelerated C++ and is intended to produce a better spread of numbers over the required range than say using the mod operator would. But it can't compensate for always being (effectively) given the same seed.
Upvotes: 10
Reputation: 4985
Also to mention, you can even get rid of that strange bucket_size
variable and use the following method to generate numbers from a
to b
inclusively:
srand ((unsigned)time(NULL));
const int a = -1;
const int b = 1;
int x = rand() % ((b - a) + 1) + a;
int y = rand() % ((b - a) + 1) + a;
int z = rand() % ((b - a) + 1) + a;
Upvotes: 3