SubstantialRange

Reputation: 53

Generating random natural numbers with higher probability for lower numbers?

I'm looking for a function like Python's random.randint() which generates random whole numbers between a and b, but one which is more likely to generate numbers closer to a, with only a few closer to b.

Is there a function that does that?

Upvotes: 5

Views: 3269

Answers (2)

Peter O.

Reputation: 32878

Your question is vague, as there are numerous random distributions in which lower numbers are more likely than higher numbers. The phrase "between a and b" is also ambiguous, since it doesn't say whether the endpoints are included. Here is one of many examples, which produces a random integer in the closed interval [a, b] in the manner you're asking for:

min(random.randint(a, b), random.randint(a, b))

And here's another:

min(random.randint(a, b), random.randint(a, b), random.randint(a, b))

With more and more random.randint(a, b), their minimum tends to be more and more concentrated towards the lower end of the range.
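
As a rough sketch of this idea (the function name min_of_k_randint and its rng parameter are made up for illustration), the snippet below wraps the expression in a helper and tallies some draws. With n = b - a + 1 equally likely values, the minimum equals a with probability 1 - ((n - 1)/n)^k, so the skew grows with k.

```python
import random
from collections import Counter

def min_of_k_randint(a, b, k=2, rng=random):
    """Return the minimum of k independent rng.randint(a, b) draws.

    Hypothetical helper; rng defaults to the random module itself.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return min(rng.randint(a, b) for _ in range(k))

# Tally 10,000 draws on [1, 5] with k = 3 to see the skew towards 1.
counts = Counter(min_of_k_randint(1, 5, k=3) for _ in range(10_000))
for value in sorted(counts):
    print(value, counts[value])
```

The exact counts vary from run to run, but the lower values should appear noticeably more often than the higher ones.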

The user "pjs" wrote the following comment:

Both of those can be generalized to the same form, the minimum of k order statistics, which can be generated using a single random number and then scaled to the correct range: int(math.floor(a + (b - a + 1) * (1.0 - random.random()**(1.0 / k)))). When k == 2 this has a triangle distribution, and for higher values of k it becomes more and more heavily weighted towards a. Basing it on a single random number also makes this method amenable to common random numbers or antithetic random numbers if you want to play games with "variance reduction" strategies in Monte Carlo sims.
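
For reference, here is a minimal sketch of the formula from that comment. The function name and the optional u parameter are assumptions added for illustration and are not part of the comment. Since random.random() can return exactly 0.0, the raw expression can evaluate to b + 1 in that rare case, so the sketch clamps the result to b.

```python
import math
import random

def min_of_k_by_inversion(a, b, k, u=None):
    """Draw from the min-of-k distribution on [a, b] using one uniform variate.

    u is a hypothetical hook for supplying the uniform variate directly,
    for example when experimenting with common or antithetic random numbers.
    """
    if u is None:
        u = random.random()  # in [0.0, 1.0)
    x = 1.0 - u ** (1.0 / k)  # in (0.0, 1.0]; equals 1.0 when u == 0.0
    value = int(math.floor(a + (b - a + 1) * x))
    return min(value, b)  # clamp the rare u == 0.0 case back into [a, b]
```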

However, there are issues with this formula.

  • For one, there are issues of accuracy: the expression random.random()**(1.0 / k) is ill-conditioned near 1 and approaches 1 for large k, so that in common floating-point arithmetic which is coarser from 1/2 to 1 than from 0 to 1/2, "there could be an accuracy problem" (Devroye, 1986, Non-Uniform Random Variate Generation, page 675).
  • Second, it's rather inelegant to invoke floating-point numbers just to output random integers in the end. After all, computers generate random floating-point variates by transforming integers, not the other way around.
  • And finally, we should generally not be concerned about differences in efficiency (performance) between random variate generation methods, unless we have used them in an application, measured the running time, and found the running time to be unacceptable. This is a general programming issue known as "premature optimization". Although my approach in this answer may use many random variates, it's convenient, especially since we can rewrite it as: min(random.randint(a, b) for i in range(k)) for some integer k greater than 0.

Upvotes: 6

pjs

Reputation: 19855

Yes, there are infinitely many functions that do that. The sole requirements to be a legitimate discrete probability distribution are that p(x) ≥ 0 for all x in the range [a, b], and that sum(p(x)) = 1 over that range. Consequently, any g(x) which is non-negative over the range [a, b], is not zero everywhere, and has g(x) < ∞ for all a ≤ x ≤ b can be converted to a valid distribution by finding total = sum(g(x)) from a to b, and scaling to get p(x) = g(x) / total.

Pick any function g(x) as described above that is decreasing in x (there's an infinite number of them for you to choose from), scale as described, and generate from the resulting p(x). There are several ways to generate from discrete distributions once you have a table of p(x)'s, such as discrete inversion or Walker's alias method.
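
As one possible sketch of that recipe, the code below scales a decreasing g(x) into a table of p(x) values and samples from it with random.choices, which picks by cumulative weights and so amounts to a form of discrete inversion. The helper name make_sampler and the choice g(x) = 1/x are assumptions for illustration only; any other decreasing, non-negative g would do.

```python
import math
import random

def make_sampler(a, b, g):
    """Build p(x) = g(x) / total on [a, b] and return (probs, sample).

    Hypothetical helper: g must be non-negative and finite on [a, b],
    and positive for at least one x.
    """
    values = list(range(a, b + 1))
    weights = [g(x) for x in values]
    if any(w < 0 or not math.isfinite(w) for w in weights):
        raise ValueError("g(x) must be non-negative and finite on [a, b]")
    total = sum(weights)
    if total <= 0:
        raise ValueError("g(x) must be positive for at least one x")
    probs = [w / total for w in weights]

    def sample():
        # random.choices returns a list of k items; take the single draw.
        return random.choices(values, weights=probs, k=1)[0]

    return probs, sample

# Example: g(x) = 1/x on [1, 10], so smaller numbers are more likely.
probs, sample = make_sampler(1, 10, lambda x: 1.0 / x)
print([sample() for _ in range(20)])
```

For large tables or many draws, an alias-method implementation could replace the random.choices call without changing how p(x) is built.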

Upvotes: 2
