zzz2991
zzz2991

Reputation: 757

getting random numbers larger than RAND_MAX

Question 7-9 of Accelerated C++ by Andrew Koenig asks:

7-9. (difficult) The implementation of nrand in §7.4.4/135 will not work for arguments greater than RAND_MAX. Usually, this restriction is no problem, because RAND_MAX is often the largest possible integer anyway. Nevertheless, there are implementations under which RAND_MAX is much smaller than the largest possible integer. For example, it is not uncommon for RAND_MAX to be 32767 (2^15 -1) and the largest possible integer to be 2147483647 (2^31 -1). Reimplement nrand so that it works well for all values of n.

If n > RAN_MAX my thoughts are to take

double temp = n/RAN_MAX + .5;
int mult = temp;
int randomNum = 0;

for (int i = 0; i != mult; mult++)
    randomNum += rand();

then test to see if randomNum < n. Would this work to generate a random number > RAND_MAX? I don't know how to use larger integers than my computer can handle, so I don't think there is any real way to tell.

Upvotes: 5

Views: 7290

Answers (2)

Sigmund Freud
Sigmund Freud

Reputation: 13

Earlier in the text, the authors propose the following definition of nrand:

int nrand(int n) {
  if (n <= 0 || n > RAND_MAX)
    throw domain_error("Argument to nrand is out of range");
  const int bucket_size = RAND_MAX / n;
  int r;
  do r = rand() / bucket_size;
  while (r >= n);
  return r;
}

If n > RAND_MAX, then for some non-negative integers Q and 0 <= R < RAND_MAX we have n == Q * RAND_MAX + R. We can try to randomly generate numbers 0 <= q <= Q and 0 <= r < RAND_MAX separately, and, like the original nrand, retry if q * RAND_MAX + r >= n.

This is correct because if every possible value of q and every possible value of r are equally likely and independently generated, every possible pair (q,r) is equally likely and corresponds to a unique non-negative integer between 0 and Q * RAND_MAX + (RAND_MAX - 1), which is at least as large as n. By discarding the pairs (q,r) that generate numbers that are too large, we remain unbiased toward any number or group of numbers that is acceptable.

We can solve the problem recursively since Q+1 < n, (and r can be generated immediately by nrand(RAND_MAX)).

For simplicity I'm going to assume that the max values of q and r won't cause an overflow in the computation, but that can be detected with additional modifications or by widening the intermediate computations to a type strictly larger than int.

int nrand2(int n) {
  if (n <= 0)
    throw domain_error("Argument to nrand2 is out of range");
  if (n <= RAND_MAX) {
    return nrand(n);
  } else {
    int q = n / RAND_MAX;
    int c; // candidate
    do c = nrand2(q+1) * RAND_MAX + nrand(RAND_MAX);
    while (c >= n);
    return c;
  }
}

I don't know a bound for the number of failures per call parameterized by RAND_MAX and n.

Upvotes: 1

Richard
Richard

Reputation: 61499

If you're truly mucking with integers larger than your computer can handle, that's, well, complicated.

But you do have several options for integers big than int, these include: unsigned int, long, unsigned long, long long, unsigned long long in increasing order of bigness. Just how big the numbers become various depending on your architecture.

For instance, on my machine I have the following:

Data Type:   Bytes  Minimum               Maximum
Short SInt:  2      -32768                32767
Short UInt:  2      0                     65535
UInt:        4      0                     4294967295
SInt:        4      -2147483648           2147483647
ULong:       8      0                     18446744073709551615
SLong:       8      -9223372036854775808  9223372036854775807
ULong Long:  8      0                     18446744073709551615
SLong Long:  8      -9223372036854775808  9223372036854775807

So, as you can see, you can make numbers much larger than int and 32767.

One way to do this is as follows:

double a=rand()/(double)RAND_MAX;
unsigned long long random_n=(unsigned long long)(BIG_MAXIMUM_NUMBER*a);

However, due to the discrete nature of floating-point numbers, this may mean that some values will just never show up in your output stream.

C++11 has a library which solves both this problem and the problem you mention. An example of its usage is:

const int min = 100000;
const int max = 1000000;
std::default_random_engine generator;
std::uniform_int_distribution<int> distribution(min,max);
int random_int = distribution(generator);

Just change the data types to suit your big needs.

Another way to look at this is that we can interpret rand() as returning a bit-field and that, since it is a uniform PRNG, all bit-fields are equally likely. We can then just make multiple calls to rand() to get multiple equally-likely bit-fields and merge these to make big numbers. Here's how we would do this to make a 16-bit random number from two 8-bit random numbers:

uint16 a=(uint16)(rand()&255);
uint16 b=(uint16)(rand()&255);
uint16 random_int=b<<8 | a;

The rand()&255 keeps only the 8 least significant bits of whatever number rand() returns; that is, it keeps only the last byte of rand().

The (uint16) casts this byte into an unsigned 16-bit number.

a<<8 shifts the bits of a 8 bits to the left, which makes room to safely add b.

But what if rand() returns a signed-value, such that the most-significant bit is always 0 or 1? We can then do the following:

uint16 a=(uint16)(rand()&255);
uint16 b=(uint16)(rand()&255);
uint16 c=(uint16)(rand()&1);
uint16 random_int=c<<14 | b<<7 | a;

We left-shift b only 7-bits so that the 8th least significant bit is random. This means the 14th and 15th least significant bits will be non-random. Since we want to mimic the behaviour of rand(), we leave the 15th least significant bit non-random, and grab a single random bit to left-shift into the 14th LSB's place.

Upvotes: 4

Related Questions