Reputation: 427
For example, array of pairs of ranges p={[23, 28], [11, 14], [31, 39]}
You have to return random number such that this number should be:
I can easily generate a random number that lies between any of these intervals like
int x=(rand()%(p[i].second-p[i].first) )+ p[i].first;
How can I generate a random number that can lie in any interval in the array and that too with equal probability.
Upvotes: 1
Views: 1177
Reputation: 4733
We could follow a common approach by defining a specialized generator class.
We can overload its () function call operator, so once we have an object of that class, we just get a variate thru variate = genObj()
. Underneath, the class is to call standard C++11 (and beyond) random number generation facilities, as exposed for example in the ISO C++ N3551 paper.
Note that rand()
and srand()
are legacy C facilities.
In this example, there are 19 possible variates. We number them from 0 to 18. First, we generate uniformly a random integer in [0..18]. This is called the rank. Second, from the rank we compute the actual variate; this second step is known as unranking.
The generator class has a drawRank()
method which returns a rank.
We define the cumulants of the range set as, for each range, the size of that range plus the sum of the sizes of all ranges before itself.
Once we have determined that the variate for a certain rank lies in the i-th range, unranking can be done like this:
variate = if (i == 0) then ranges[0].lowerBound + rank
else ranges[i].lowerBound + (rank - cumulants[i-1])
Side note: as here the ranges are quite small, instead we could unrank with just a simple vector used as a lookup table, translating ranks to variates directly. But if some ranges become large, like [1000000,199999], this approach obviously becomes a memory wasting one.
So we can now proceed to write our specialized generator class.
First, the public part, with the #include
clauses and the constructor:
#include <cstdlib>
#include <map>
#include <random>
#include <iostream>
using Ranges = std::vector<std::pair<int,int>>;
class RangesGen
{
public:
// Constructor:
RangesGen(int randomSeed_, const Ranges& ranges_) :
randomSeed(randomSeed_), ranges(ranges_),
engine(randomSeed_), ud(0, getTotalCount(ranges_)-1),
totalCount(getTotalCount(ranges_)),
lengths(), cumulants()
{
for (auto p: ranges) {
lengths.push_back(1 + p.second - p.first);
cumulants.push_back(0); // just to fill it up
}
cumulants[0] = lengths[0];
for (int i=1; i < lengths.size(); i++)
cumulants[i] = cumulants[i-1] + lengths[i];
}
The overloaded function call operator, with the unranking logic, is also in the public part of the class:
// return a random variate:
int operator()()
{
int rank = drawRank();
int i = 0;
while (cumulants[i] <= rank)
i++;
// here rank < cum[i] - so rank fits in the i-th range
// unranking logic :
if (i==0)
return (ranges[0].first + rank);
else
return ( ranges[i].first + (rank - cumulants[i-1]) );
}
Second, the private part, with the actual list of fields and the drawRank()
private method:
private:
int randomSeed;
std::mt19937 engine; // raw random number generator
std::uniform_int_distribution<int> ud;
Ranges ranges;
int totalCount;
bool verbose;
std::vector<int> lengths;
std::vector<int> cumulants;
static int getTotalCount(const Ranges& ranges)
{
int total = 0;
for (auto p: ranges)
total += 1 + p.second - p.first;
return total;
}
int drawRank()
{
return ud(engine);
}
};
We can write a small test program, which generates 19*2,000,000 = 38 millions variates, and then computes the corresponding histogram. As this is supposed to be an uniform distribution, we expect each variate to get close to 2 millions instances.
int main()
{
Ranges myRanges {{23, 28}, {11, 14}, {31, 39}};
int randomSeed {424344};
RangesGen gen1(randomSeed, myRanges); // specialized variate generator
std::map<int,int> histogram{}; // empty map
for (int j=0; j < 19*2000000; j++) {
int variate = gen1();
if (histogram.end() == histogram.find(variate))
histogram[variate] = 0; // First sight of this variate, insert it.
int prevCount = histogram[variate];
histogram[variate] = prevCount + 1;
}
// display histogram contents:
for (auto itr = histogram.begin(); itr != histogram.end(); ++itr) {
int variate = itr->first;
int count = itr->second;
std::cout << count << " instances of variate " << variate <<'\n';
}
return EXIT_SUCCESS;
}
2001500 instances of variate 11
1999233 instances of variate 12
1999842 instances of variate 13
1999205 instances of variate 14
1999467 instances of variate 23
1999725 instances of variate 24
2001476 instances of variate 25
1999592 instances of variate 26
2001036 instances of variate 27
2002098 instances of variate 28
2000420 instances of variate 31
2000231 instances of variate 32
2000569 instances of variate 33
1996539 instances of variate 34
2000371 instances of variate 35
1999316 instances of variate 36
2000151 instances of variate 37
1998338 instances of variate 38
2000891 instances of variate 39
So the histogram is as expected.
Upvotes: 1
Reputation: 93
Generally, you shouldn't use rand and scale by division, because it can lead to some numbers being generated more commonly. There is a very good talk about this 'rand Considered Harmful from MSDN a few years ago. In modern C++ you should probably use the random distribution methods brought in in C++11 in the <random>
header. For e.g.:
std::mt19937 generator(0); // Use mersenne twister algorithm with seed 0 - shouldn't use same seed repeatedly but generate this in some way
// set up a uniform distribution:
std::uniform_int_distribution<> distribution(23, 28);
// generate a random number in that range
distribution(generator)
Upvotes: -1
Reputation: 234655
Draw a number uniformly in the range [0, 18].
Add 11.
If that number is 15 or more then add 8.
If that number is 29 or more then add 2.
This method has the advantage that it uses only one drawing. Using multiple drawings can cause the statistical properties of the random numbers to deteriorate depending on the properties of the underlying generator. Sampling with rejection can cause issues with some low discrepancy sequences such as Sobol.
Upvotes: 3