I'm wondering if there is an algorithm to generate random numbers that will most likely be low in a range from min to max. For instance, if you generate a random number between 1 and 100, it should most of the time be below 30 if you call the function with f(min: 1, max: 100, avg: 30), but if you call it with f(min: 1, max: 200, avg: 10), the average should be 10.

A lot of games do this, but I simply can't find a way to do it with a formula. Most of the examples I have seen use a "drop table" or something like that.

I have come up with a fairly simple way to weight the outcome of a roll, but it is not very efficient and you don't have a lot of control over it:

    var pseudoRand = function(min, max, n) {
        if (n > 0) {
            return pseudoRand(min, Math.random() * (max - min) + min, n - 1);
        }
        return max;
    };

    var rands = [];
    for (var i = 0; i < 20000; i++) {
        rands.push(pseudoRand(0, 100, 1));
    }

    var avg = rands.reduce(function(x, y) { return x + y; }) / rands.length;
    console.log(avg); // ~50

The function simply picks a random number between min and max N times, updating the max with the last roll on every iteration. So if you call it with N = 2 and max = 100, it must roll 100 two times in a row in order to return 100.

I have looked at some distributions on Wikipedia, but I don't understand them well enough to know how I can control the min and max outputs, etc. Any help is very much welcome.

Generating Random Numbers for RPG games

Answers (12)

Reputation: 664

Well, from what I can see of your problem, I would want the solution to meet these criteria:

a) Belong to a single distribution: If we need to "roll" (call Math.random) more than once per function call and then aggregate or discard some results, the output is no longer distributed according to the given function.

b) Not be computationally intensive: Some of the solutions use integrals (Gamma distribution, Gaussian distribution), and those are computationally intensive. In your description, you mention that you want to be able to "calculate it with a formula", which fits this description (basically, you want an O(1) function).

c) Be relatively "well distributed", e.g. not have peaks and valleys, but instead have most results cluster around the mean, with nice predictable slopes downwards towards the ends, and yet a non-zero probability at the min and the max.

d) Not require storing a large array in memory, as drop tables do.

I think this function meets the requirements:

    var pseudoRand = function(min, max, avg )
    {
        var randomFraction = Math.random();
        var head = (avg - min);
        var tail = (max - avg);
        var skewdness = tail / (head + tail);
        if (randomFraction < skewdness)
            return min + (randomFraction / skewdness) * head;
        else
            return  avg + (1 - randomFraction) / (1 - skewdness) * tail;
    }

This will return floats, but you can easily turn them into integers by calling

    Math.round(pseudoRand(...))

It returned the correct average in all of my tests, and it is also nicely distributed towards the ends. Hope this helps. Good luck.
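
A short derivation suggests why the average comes out right. The shorthand here is an assumption for illustration, not part of the answer: h = avg - min (head), t = max - avg (tail), s = t / (h + t) (skewdness). With probability s the function returns a value uniform on [min, avg], otherwise a value uniform on [avg, max], so:

```latex
\begin{aligned}
s &= \frac{t}{h+t}, \qquad 1 - s = \frac{h}{h+t}, \\
E[X] &= s\left(\mathrm{min} + \frac{h}{2}\right) + (1-s)\left(\mathrm{avg} + \frac{t}{2}\right) \\
     &= \frac{t\,\mathrm{min} + h\,\mathrm{avg} + h\,t}{h+t}
      = \frac{(h+t)\,\mathrm{min} + h\,(h+t)}{h+t}
      = \mathrm{min} + h = \mathrm{avg}.
\end{aligned}
```

The third step uses avg = min + h. When the tail is longer than the head, s is above 0.5, so more than half of the rolls land below avg.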

Upvotes: 0

nox

Reputation: 252

Having seen many good explanations and some good ideas, I still think this could help you:

You can take any distribution function f around 0 and map its interval of interest onto your desired interval [1,100]: f -> f'.

Then feed the C++ discrete_distribution with the results of f'.

I've got an example with the normal distribution below, but I can't get my result into this function :-S

#include <iostream>
#include <random>
#include <chrono>
#include <cmath>
#include <string>


using namespace std;


double p1(double x, double mean, double sigma); // p(x|x_avg,sigma)
double p2(int x, int x_min, int x_max, double x_avg, double z_min, double z_max); // transform ("stretch") it to the interval
int plot_ps(int x_avg, int x_min, int x_max, double sigma);

int main()
{
    int x_min = 1;
    int x_max = 20;
    int x_avg = 6;

    double sigma = 5;

    /*
    int p[]={2,1,3,1,2,5,1,1,1,1};

    default_random_engine generator (chrono::system_clock::now().time_since_epoch().count());
    discrete_distribution<int> distribution(begin(p), end(p)); // weights are passed as an iterator range

    for (int i=0; i< 10; i++)
        cout << i << "\t" << distribution(generator) << endl;
    */
    plot_ps(x_avg, x_min, x_max, sigma);

    return 0;
}

// Normal distribution function
double p1(double x, double mean, double sigma)
{
    return 1/(sigma*sqrt(2*M_PI))
         * exp(-(x-mean)*(x-mean) / (2*sigma*sigma));
}

// Transforms intervals to your wishes ;)
// z_min and z_max are the desired values f'(x_min) and f'(x_max)
double p2(int x, int x_min, int x_max, double x_avg, double z_min, double z_max)
{
    double y;
    double sigma = 1.0;
    double y_min = -sigma*sqrt(-2*log(z_min));
    double y_max =  sigma*sqrt(-2*log(z_max));
    if(x < x_avg)
        y = -(x-x_avg)/(x_avg-x_min)*y_min;
    else
        y = -(x-x_avg)/(x_avg-x_max)*y_max;
    return p1(y, 0.0, sigma);
}

//plots both distribution functions
int plot_ps(int x_avg, int x_min, int x_max, double sigma)
{
    double z = (1.0+x_max-x_min);

    // plot p1
    for (int i=1; i<=20; i++)
    {
        cout << i << "\t" <<
        string(int(p1(i, x_avg, sigma)*(sigma*sqrt(2*M_PI)*20.0)+0.5), '*')
        << endl;
    }

    cout << endl;

    // plot p2
    for (int i=1; i<=20; i++)
    {
        cout << i << "\t" <<
        string(int(p2(i, x_min, x_max, x_avg, 1.0/z, 1.0/z)*(20.0*sqrt(2*M_PI))+0.5), '*')
        << endl;
    }

    return 0; // the function is declared to return int
}

With the following result if I let them plot:

1   ************
2   ***************
3   *****************
4   ******************
5   ********************
6   ********************
7   ********************
8   ******************
9   *****************
10  ***************
11  ************
12  **********
13  ********
14  ******
15  ****
16  ***
17  **
18  *
19  *
20  

1   *
2   ***
3   *******
4   ************
5   ******************
6   ********************
7   ********************
8   *******************
9   *****************
10  ****************
11  **************
12  ************
13  *********
14  ********
15  ******
16  ****
17  ***
18  **
19  **
20  *

So, if you could give this result to discrete_distribution<int>, you would have everything you want.
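
The missing last step can be sketched in JavaScript, the question's language. In C++ itself, `std::discrete_distribution` has a constructor that takes an iterator range of weights. Below, `sampleIndex` and `stretchedBellWeights` are hypothetical names, and the weight function only mirrors the idea of `p2` (a bell curve stretched separately on each side of the average), not its exact scaling.

```javascript
// Hypothetical helper: draw an index from unnormalized weights,
// the role std::discrete_distribution plays in C++.
function sampleIndex(weights, rand = Math.random) {
  const total = weights.reduce((sum, w) => sum + w, 0);
  let threshold = rand() * total;
  for (let i = 0; i < weights.length; i++) {
    threshold -= weights[i];
    if (threshold < 0) return i;
  }
  return weights.length - 1; // guard against floating-point round-off
}

// Two-sided bell curve peaking at avg. The values min and max get the
// relative weight edgeWeight (0 < edgeWeight <= 1); avg gets weight 1.
function stretchedBellWeights(min, max, avg, edgeWeight) {
  const yEdge = Math.sqrt(-2 * Math.log(edgeWeight)); // exp(-yEdge^2 / 2) equals edgeWeight
  const weights = [];
  for (let x = min; x <= max; x++) {
    const span = x < avg ? avg - min : max - avg;
    const y = span === 0 ? 0 : ((x - avg) / span) * yEdge;
    weights.push(Math.exp(-(y * y) / 2));
  }
  return weights;
}

const weights = stretchedBellWeights(1, 20, 6, 1 / 20);
const roll = 1 + sampleIndex(weights); // integer in [1, 20], most often near 6
```

The weights only need to be computed once per (min, max, avg) combination; each roll is then a single random number and a short scan.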

Upvotes: 0

AlgorithmsX

Reputation: 307

A probability density function (PDF) is just a function that, when you put in a value X, returns the probability (for continuous distributions, the probability density) of getting that value X. A cumulative distribution function (CDF) gives the probability of getting a number less than or equal to X. A CDF is the integral of a PDF. A CDF is almost always a one-to-one function, so it almost always has an inverse.

To generate a PDF, plot the value on the x-axis and the probability on the y-axis. The sum (discrete) or integral (continuous) of all the probabilities should add up to 1. Find some function that models that shape correctly. To do this, you may have to look up some PDFs.

Basic Algorithm

https://en.wikipedia.org/wiki/Inverse_transform_sampling

This algorithm is based off of Inverse Transform Sampling. The idea behind ITS is that you are randomly picking a value on the y-axis of the CDF and finding the x-value it corresponds to. This makes sense because the more likely a value is to be randomly selected, the more "space" it will take up on the y-axis of the CDF.

1. Come up with some probability distribution formula. For instance, if you want it so that as the numbers get higher the odds of them being chosen increases, you could use something like f(x)=x or f(x)=x^2. If you want something that bulges in the middle, you could use the Gaussian Distribution or 1/(1+x^2). If you want a bounded formula, you can use the Beta Distribution or the Kumaraswamy Distribution.
2. Integrate the PDF to get the Cumulative Distribution Function.
3. Find the inverse of the CDF.
4. Generate a random number and plug it into the inverse of the CDF.
5. Multiply that result by (max-min) and then add min.
6. Round the result to the nearest integer.

Steps 1 to 3 are things you have to hard code into the game. The only way around it, for any PDF, is to solve for the shape parameters that correspond to the desired mean while satisfying the constraints you place on the shape parameters. If you want to use the Kumaraswamy Distribution, you will set it so that the shape parameters a and b are always greater than one.

I would suggest using the Kumaraswamy Distribution because it is bounded and it has a very nice closed form and closed form inverse. It only has two parameters, a and b, and it is extremely flexible, as it can model many different scenarios, including polynomial behavior, bell curve behavior, and a basin-like behavior that has a peak at both edges. Also, modeling isn't too hard with this function. The higher the shape parameter b is, the more tilted it will be to the left, and the higher the shape parameter a is, the more tilted it will be to the right. If a and b are both less than one, the distribution will look like a trough or basin. If a or b is equal to one, the distribution will be a polynomial that does not change concavity from 0 to 1. If both a and b equal one, the distribution is a straight line. If a and b are greater than one, then the function will look like a bell curve. The best thing you can do to learn this is to actually graph these functions or just run the Inverse Transform Sampling algorithm.

https://en.wikipedia.org/wiki/Kumaraswamy_distribution

For instance, if I want to have a probability distribution shaped like this with a=2 and b=5 going from 0 to 100:

https://www.wolframalpha.com/input/?i=2*5*x%5E(2-1)*(1-x%5E2)%5E(5-1)+from+x%3D0+to+x%3D1

Its CDF would be:

CDF(x)=1-(1-x^2)^5

Its inverse would be:

CDF^-1(x)=(1-(1-x)^(1/5))^(1/2)

The General Inverse of the Kumaraswamy Distribution is: CDF^-1(x)=(1-(1-x)^(1/b))^(1/a)

I would then generate a number from 0 to 1, put it into the CDF^-1(x), and multiply the result by 100.
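
A minimal sketch of that recipe in JavaScript, using the general inverse given above. The function names and the optional `rand` parameter (handy for testing) are assumptions made for this example.

```javascript
// Inverse CDF (quantile function) of the Kumaraswamy distribution on [0, 1].
function kumaraswamyInverseCdf(u, a, b) {
  return Math.pow(1 - Math.pow(1 - u, 1 / b), 1 / a);
}

// Hypothetical wrapper: draw u uniformly, map it through the inverse CDF,
// then stretch the result from [0, 1] to [min, max].
function kumaraswamyRand(min, max, a, b, rand = Math.random) {
  return min + (max - min) * kumaraswamyInverseCdf(rand(), a, b);
}

const sample = kumaraswamyRand(0, 100, 2, 5); // float between 0 and 100
```

Wrapping the call in Math.round gives integer rolls, as in the last step of the algorithm.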

Pros

Very accurate
Continuous, not discrete
Uses one formula and very little space
Gives you a lot of control over exactly how the randomness is spread out
Many of these formulas have CDFs with inverses of some sort
There are ways to bound the functions on both ends. For instance, the Kumaraswamy Distribution is bounded from 0 to 1, so you just input a float between zero and one, then multiply the result by (max-min) and add min. The Beta Distribution is bounded differently based on what values you pass into it. For something like PDF(x)=x, the CDF(x)=(x^2)/2, so you can generate a random value from CDF(0) to CDF(max-min).

Cons

You need to come up with the exact distributions and their shapes you plan on using
Every single general formula you plan on using needs to be hard coded into the game. In other words, you can program the general Kumaraswamy Distribution into the game and have a function that generates random numbers based on the distribution and its parameters, a and b, but not a function that generates a distribution for you based on the average. If you wanted to use Distribution x, you would have to find out what values of a and b best fit the data you want to see and hard code those values into the game.

Upvotes: 4

Antoine Bergamaschi

Reputation: 104

You may combine 2 random processes. For example:

first rand R1 = f(min: 1, max: 20, avg: 10); second rand R2 = f(min:1, max : 10, avg : 1);

and then multiply R1*R2 to have a result between [1-200] and average around 10 (the average will be shifted a bit)

Another option is to find the inverse of the random function you want to use. This option has to be initialized when your program starts but doesn't need to be recomputed. The math used here can be found in a lot of Math libraries. I will explain point by point by taking the example of an unknown random function where only four points are known:

1. First, fit the four point curve with a polynomial function of order 3 or higher.
2. You should then have a parametrized function of type: ax+bx^2+cx^3+d.
3. Find the indefinite integral of the function (the form of the integral is of type (a/2)x^2+(b/3)x^3+(c/4)x^4+dx, which we will call quarticEq).
4. Compute the integral of the polynomial from your min to your max.
5. Take a uniform random number between 0 and 1, then multiply by the value of the integral computed in step 4 (we name the result "R").
6. Now solve the equation R = quarticEq(x) - quarticEq(min) for x.

Hopefully the last part is well known, and you should be able to find a library that can do this computation (see wiki). If the inverse of the integrated function does not have a closed form solution (like in any general polynomial with degree five or higher), you can use a root finding method such as Newton's Method.
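
The steps above can be sketched as follows. This is an illustration under assumptions: the coefficient names follow the answer's ax+bx^2+cx^3+d convention, `makePolynomialSampler` is a hypothetical name, the density must be non-negative on [min, max], and plain bisection is used in place of Newton's method because it cannot overshoot the interval.

```javascript
// density(x) = a*x + b*x^2 + c*x^3 + d, assumed non-negative on [min, max].
function makePolynomialSampler({ a, b, c, d }, min, max, rand = Math.random) {
  // Antiderivative of the density (the "quarticEq" of the answer).
  const quarticEq = (x) =>
    (a / 2) * x ** 2 + (b / 3) * x ** 3 + (c / 4) * x ** 4 + d * x;
  const area = quarticEq(max) - quarticEq(min);

  return function sample() {
    const target = rand() * area; // the "R" of the answer
    // Bisection: the area accumulated from min never decreases as x grows,
    // so the bracket [lo, hi] can be halved until it pins down x.
    let lo = min;
    let hi = max;
    for (let i = 0; i < 60; i++) {
      const mid = (lo + hi) / 2;
      if (quarticEq(mid) - quarticEq(min) < target) lo = mid;
      else hi = mid;
    }
    return (lo + hi) / 2;
  };
}

// Example density: a falling straight line, 1 - x/100 on [0, 100].
const sampleLow = makePolynomialSampler({ a: -0.01, b: 0, c: 0, d: 1 }, 0, 100);
const value = sampleLow(); // low values are more likely than high ones
```

The fitting and integration happen once, when the sampler is created; each call afterwards costs one random number and a fixed number of polynomial evaluations.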

This kind of computation may be used to create any kind of random distribution.

Edit:

You may find the Inverse Transform Sampling described above on Wikipedia, and I found this implementation (I haven't tried it).

Upvotes: 1

Dimitri Lavrenük

Reputation: 4879

I would use a simple mathematical function for that. From what you describe, you need a power curve like y = x^2. At the midpoint x = 0.5 (the middle of rand's range, since rand gets you a number from 0 to 1) you would get 0.25. If you want lower numbers, you can use a higher exponent like y = x^3, which would result in y = 0.125 at x = 0.5.

Example: http://www.meta-calculator.com/online/?panel-102-graph&data-bounds-xMin=-2&data-bounds-xMax=2&data-bounds-yMin=-2&data-bounds-yMax=2&data-equations-0=%22y%3Dx%5E2%22&data-rand=undefined&data-hideGrid=false

PS: I adjusted the function to calculate the exponent needed so that x = 0.5 maps to the requested average. Code example:

function expRand (min, max, exponent) {
    return Math.round( Math.pow( Math.random(), exponent) * (max - min) + min);
}

function averageRand (min, max, average) {
    var exponent = Math.log(((average - min) / (max - min))) / Math.log(0.5);
    return expRand(min, max, exponent);
}

alert(averageRand(1, 100, 10));
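
One caveat worth illustrating: the exponent above places the requested value at Math.random() === 0.5, which fixes the median of the rolls rather than their mean. For a uniform U on [0, 1), the mean of U^k is 1 / (k + 1), so an exponent of (max - average) / (average - min) targets the mean instead. The sketch below uses the hypothetical names `meanExponent` and `meanRand` and assumes min < average < max.

```javascript
// Hypothetical variant of averageRand: choose the exponent so that the MEAN
// of the rolls (not the roll at Math.random() === 0.5) lands on `average`.
// For uniform U on [0, 1), E[U^k] = 1 / (k + 1).
function meanExponent(min, max, average) {
  return (max - average) / (average - min);
}

function meanRand(min, max, average, rand = Math.random) {
  const exponent = meanExponent(min, max, average);
  return min + (max - min) * Math.pow(rand(), exponent); // float; round if needed
}

// Empirical check: the sample mean should come out close to 10.
let total = 0;
const rolls = 200000;
for (let i = 0; i < rolls; i++) total += meanRand(1, 100, 10);
console.log(total / rolls);
```

Which of the two is preferable depends on whether "average" is meant as the typical roll (median) or the long-run mean.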

Upvotes: 1

Luke

Reputation: 848

Try this: generate a random number for the range of numbers below the average and generate a second random number for the range of numbers above the average.

Then randomly select one of those; each range will be selected 50% of the time.

var psuedoRand = function(min, max, avg) {
  // Math.floor truncates to an integer; JavaScript has no (int) cast.
  var upperRand = Math.floor(Math.random() * (max - avg) + avg);
  var lowerRand = Math.floor(Math.random() * (avg - min) + min);

  if (Math.random() < 0.5)
    return lowerRand;
  else
    return upperRand;
};

Upvotes: 0

Seth

Reputation: 1535

// Requires: import java.util.concurrent.ThreadLocalRandom;
private int roll(int minRoll, int avgRoll, int maxRoll) {
    // Generating random number #1
    int firstRoll = ThreadLocalRandom.current().nextInt(minRoll, maxRoll + 1);

    // Iterating 3 times will result in the roll being relatively close to
    // the average roll.
    if (firstRoll > avgRoll) {
        // If the first roll is higher than the (set) average roll:
        for (int i = 0; i < 3; i++) {
            int verificationRoll = ThreadLocalRandom.current().nextInt(minRoll, maxRoll + 1);

            if (firstRoll > verificationRoll && verificationRoll >= avgRoll) {
                // The verification roll is closer to avgRoll than the
                // current roll, so keep it instead.
                firstRoll = verificationRoll;
            }
        }
    } else if (firstRoll < avgRoll) {
        // If the first roll is lower than the (set) average roll:
        for (int i = 0; i < 3; i++) {
            int verificationRoll = ThreadLocalRandom.current().nextInt(minRoll, maxRoll + 1);

            if (firstRoll < verificationRoll && verificationRoll <= avgRoll) {
                // The verification roll is closer to avgRoll than the
                // current roll, so keep it instead.
                firstRoll = verificationRoll;
            }
        }
    }
    return firstRoll;
}

Explanation:

roll
check if the roll is above, below or exactly 30
if above, reroll 3 times & set the roll according to the new roll, if lower but >= 30
if below, reroll 3 times & set the roll according to the new roll, if higher but <= 30
if exactly 30, don't set the roll anew
return the roll

Pros:

simple
effective
performs well

Cons:

You'll naturally have more results that are in the range of 30-40 than you'll have in the range of 20-30, simply due to the 30-70 relation.

Testing:

You can test this by using the following method in conjunction with the roll() method. The data is saved in a hashmap (to map each number to its number of occurrences).

// Requires: import java.util.HashMap; import java.util.Map;
public void rollTheD100() {

    int maxNr = 100;
    int minNr = 1;
    int avgNr = 30;

    Map<Integer, Integer> numberOccurenceMap = new HashMap<>();

    // "Initialization" of the map (please don't hit me for calling it initialization)
    for (int i = 1; i <= 100; i++) {
        numberOccurenceMap.put(i, 0);
    }

    // Rolling (100k times)
    for (int i = 0; i < 100000; i++) {
        int dummy = roll(minNr, avgNr, maxNr);
        numberOccurenceMap.put(dummy, numberOccurenceMap.get(dummy) + 1);
    }

    int numberPack = 0;

    for (int i = 1; i <= 100; i++) {
        numberPack = numberPack + numberOccurenceMap.get(i);
        if (i % 10 == 0) {
            System.out.println("<" + i + ": " + numberPack);
            numberPack = 0;
        }
    }
}

The results (100,000 rolls):

These were as expected. Note that you can always fine-tune the results simply by modifying the iteration count in the roll() method: the closer to 30 the average should be, the more iterations should be included (note that this could hurt performance to a certain degree). Also note that 30 was (as expected) the number with the highest number of occurrences, by far.

<10: 4994
<20: 9425
<30: 18184
<40: 29640
<50: 18283
<60: 10426
<70: 5396
<80: 2532
<90: 897
<100: 223

Upvotes: 0

user2004245

Reputation: 399

One method would not be the most precise method, but could be considered "good enough" depending on your needs.

The algorithm would be to pick a number between a min and a sliding max. There would be a guaranteed max g_max and a potential max p_max. Your true max would slide depending on the results of another random call. This will give you a skewed distribution you are looking for. Below is the solution in Python.

import random

def get_roll(min, g_max, p_max):

    max = g_max + (random.random() * (p_max - g_max))

    return random.randint(min, int(max))

get_roll(1, 10, 20)

Below is a histogram of the function run 100,000 times with (1, 10, 20).

Upvotes: 0

user2314737

Reputation: 29397

A simple way to generate a random number with a given distribution is to pick a random number from a list where the numbers that should occur more often are repeated according to the desired distribution.

For example if you create a list [1,1,1,2,2,2,3,3,3,4] and pick a random index from 0 to 9 to select an element from that list you will get a number <4 with 90% probability.

Alternatively, using the distribution from the example above, generate an array [2,5,8,9] and pick a random integer from 0 to 9, if it's ≤2 (this will occur with 30% probability) then return 1, if it's >2 and ≤5 (this will also occur with 30% probability) return 2, etc.
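
The threshold variant can be sketched in a few lines of JavaScript. `pickFromThresholds` is a hypothetical name; the array is the [2,5,8,9] from the example, and the optional `rand` argument exists only to make the function testable.

```javascript
// Cumulative thresholds for outcomes 1..4 with weights 3, 3, 3, 1 (out of 10),
// matching the list [1,1,1,2,2,2,3,3,3,4] from the example.
const thresholds = [2, 5, 8, 9];

function pickFromThresholds(thresholds, rand = Math.random) {
  const size = thresholds[thresholds.length - 1] + 1; // 10 slots: 0..9
  const slot = Math.floor(rand() * size);
  // The first threshold that is >= slot identifies the outcome (1-based).
  return thresholds.findIndex((t) => slot <= t) + 1;
}

const outcome = pickFromThresholds(thresholds); // 1, 2, 3 or 4
```

Compared with the repeated list, the threshold array stores one entry per outcome instead of one entry per slot, which matters when the weights are large.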

Explained here: https://softwareengineering.stackexchange.com/a/150618

Upvotes: 8

pjs

Reputation: 19855

There are lots of ways to do so, all of which basically boil down to generating from a right-skewed (a.k.a. positive-skewed) distribution. You didn't make it clear whether you want integer or floating point outcomes, but there are both discrete and continuous distributions that fit the bill.

One of the simplest choices would be a discrete or continuous right-triangular distribution, but while that will give you the tapering off you desire for larger values, it won't give you independent control of the mean.

Another choice would be a truncated exponential (for continuous) or geometric (for discrete) distribution. You'd need to truncate because the raw exponential or geometric distribution has a range from zero to infinity, so you'd have to lop off the upper tail. That would in turn require you to do some calculus to find a rate λ which yields the desired mean after truncation.
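
As a rough sketch of the continuous case: for an exponential with rate λ truncated to [0, L], the mean is 1/λ - L/(e^(λL) - 1). Rather than solving that for λ by hand, the example below finds λ numerically by bisection, which is a substitute for the calculus mentioned above. All function names are hypothetical, and the approach assumes avg lies between min and the midpoint of the range.

```javascript
// Mean of an exponential distribution with rate `lambda`, truncated to [0, length].
function truncatedExpMean(lambda, length) {
  return 1 / lambda - length / Math.expm1(lambda * length);
}

// Find the rate whose truncated mean equals targetMean by bisection.
// Assumes 0 < targetMean < length / 2 (the mean falls as lambda grows).
function solveRate(targetMean, length) {
  let lo = 1e-6 / length; // nearly uniform: mean just under length / 2
  let hi = 1 / targetMean; // the truncated mean is already below targetMean here
  for (let i = 0; i < 100; i++) {
    const mid = (lo + hi) / 2;
    if (truncatedExpMean(mid, length) > targetMean) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}

// Hypothetical sampler: float in [min, max] with mean avg.
function truncatedExpRand(min, max, avg, rand = Math.random) {
  const length = max - min;
  const lambda = solveRate(avg - min, length); // could be cached per (min, max, avg)
  const scale = 1 - Math.exp(-lambda * length);
  // Inverse CDF of the truncated exponential, shifted so it starts at min.
  return min - Math.log(1 - rand() * scale) / lambda;
}

const lowRoll = truncatedExpRand(1, 100, 30);
```

In a game loop the rate would normally be solved once and reused, since it depends only on min, max and avg.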

A third choice would be to use a mixture of distributions, for instance choose a number uniformly in a lower range with some probability p, and in an upper range with probability (1-p). The overall mean is then p times the mean of the lower range + (1-p) times the mean of the upper range, and you can dial in the desired overall mean by adjusting the ranges and the value of p. This approach will also work if you use non-uniform distribution choices for the sub-ranges. It all boils down to how much work you're willing to put into deriving the appropriate parameter choices.
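
A minimal sketch of the two-uniform mixture, assuming a caller-chosen `split` point between the lower and upper range. The names `mixtureProbability` and `mixtureRand` are made up for this example. The probability p follows from solving p × (lower mean) + (1 - p) × (upper mean) = avg.

```javascript
// Hypothetical mixture: uniform on [min, split) with probability p,
// uniform on [split, max) with probability 1 - p.
function mixtureProbability(min, max, avg, split) {
  const lowerMean = (min + split) / 2;
  const upperMean = (split + max) / 2;
  // Solve p * lowerMean + (1 - p) * upperMean = avg for p.
  return (upperMean - avg) / (upperMean - lowerMean);
}

function mixtureRand(min, max, avg, split, rand = Math.random) {
  const p = mixtureProbability(min, max, avg, split);
  if (p < 0 || p > 1) throw new RangeError("avg is not reachable with this split");
  return rand() < p
    ? min + rand() * (split - min)
    : split + rand() * (max - split);
}

const mixedRoll = mixtureRand(1, 100, 30, 40); // about 81% of rolls fall below 40
```

Moving `split` changes the shape without changing the mean, which is the independent control the right-triangular distribution lacks.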

Upvotes: 0

SimaS

Reputation: 1

Using a drop table permits a very fast roll, which matters in a real-time game. In fact, it takes only one random number generated from a range, followed by an if statement with multiple branches that follows a table of probabilities (a Gaussian distribution over that range). Something like this:

import random

num = random.randint(1, 100)
if num <= 10:
    pass  # case 1
elif num <= 20:
    pass  # case 2
...

It is not very clean but when you have a finite number of choices it can be very fast.

Upvotes: 0

Jim

Reputation: 19582

You can keep a running average of what you have returned from the function so far and, based on that, draw random numbers in a while loop until you get one that keeps the running average on target; then update the running average and return the number.

Upvotes: 0
