Michael
Michael

Reputation: 7809

Randomly generate integers with a distribution that prefers low ones

I have an list ordered by some quality function from which I'd like to take elements, preferring the good elements at the beginning of the list.

Currently, my function to generate the random indices looks essentially as follows:

def pick():
    p = 0.2
    for i in itertools.count():
        if random.random() < p:
            break
    return i

It does a good job, but I wonder:

  1. What's the name of the generated random distribution?
  2. Is there a built-in function in Python for that distribution?

Upvotes: 3

Views: 806

Answers (3)

Severin Pappadeux
Severin Pappadeux

Reputation: 20080

You could simulate it via exponential, but this is like making square peg fit round hole. As Mark said, it is geometric distribution - discrete, shifted by 1. And it is right here in the numpy:

import numpy as np
import random
import itertools
import matplotlib.pyplot as plt

p = 0.2

def pick():
    for i in itertools.count():
        if random.random() < p:
            break
    return i

q = np.random.geometric(p, size = 100000) - 1
z = [pick() for i in range(100000)]

bins = np.linspace(-0.5, 30.5, 32)

plt.hist(q, bins, alpha=0.2, label='geom')
plt.hist(z, bins, alpha=0.2, label='pick')
plt.legend(loc='upper right')

plt.show()

Output:

enter image description here

Upvotes: 2

James
James

Reputation: 36608

What you are describing sounds a lot like the exponential distribution. It already exists in the random module.

Here is some code that takes just the integer part of sampling from an exponential distribution with a rate parameter of 100.

import random
import matplotlib.pyplot as plt

d = [int(random.expovariate(1/100)) for i in range(10000)]
h,b = np.histogram(d, bins=np.arange(0,max(d)))

plt.bar(left=b[:-1], height=h, ec='none', width=1))
plt.show()

enter image description here

Upvotes: 3

ividito
ividito

Reputation: 336

random.random() defaults to a uniform distribution, but there are other methods within random that would also work. For your given use case, I would suggest random.expovariate(2) (Documentation, Wikipedia). This is an exponential distribution that will heavily prefer lower values. If you google some of the other methods listed in the documentation, you can find some other built-in distributions.

Edit: Be sure to play around with the argument value for expovariate. Also note that it doesn't guarantee a value less than 1, so you might need to ensure that you only use values less than 1.

Upvotes: 1

Related Questions