achtundachtzig
achtundachtzig

Reputation: 59

Generating 0 or 1 at random quickly

With python I am using random.randint(0,1) to generate a 0 or 1 at random. It is quite slow though because I need to do it about a billion times. Is there a quicker way?

total=0
n=1000000000
for x in range(n):
    money=2
    while True:
        x=random.randint(0,1)
        if x==1:
            break
        if x==0:
            money*=2
            continue
    total+=money
print(total/n)

Upvotes: 1

Views: 525

Answers (5)

Andrej Kesely
Andrej Kesely

Reputation: 195613

On Linux you can read from /dev/urandom:

from random import randint, getrandbits, random
from timeit import timeit
from operator import and_
from itertools import starmap

f_in = open('/dev/urandom', 'rb')

def generate_num1(n):
    return starmap(and_, zip(f_in.read(n), [1]*n))

def generate_num2(n):
    return (randint(0, 1) for _ in range(n))

def generate_num3(n):
    return (getrandbits(1) for _ in range(n))

def generate_num4(n):
    return (1 if random() < 0.5 else 0 for _ in range(n))

print(timeit(lambda: list(generate_num1(1024)), number=1000))
print(timeit(lambda: list(generate_num2(1024)), number=1000))
print(timeit(lambda: list(generate_num3(1024)), number=1000))
print(timeit(lambda: list(generate_num4(1024)), number=1000))

Prints (on my old laptop i3-3110, ubuntu 16.04):

0.11714126999140717
1.9653857139928732
0.20527600098284893
0.1918482400069479

UPDATE 27.2.2024 (AMD 5700x/Ubuntu 20.04/Python 3.11.7):

0.020467253983952105
0.2510357769788243
0.033986544993240386
0.05389478203142062

Each of these 3 options is significantly faster than random.randint. /dev/urandom seems fastest.

Upvotes: 3

Alain T.
Alain T.

Reputation: 42137

A slightly faster solution would be to use random.getrandbits(1)

By the way, if you're wondering if your money loop will converge to a particular value, it will not. The more sampling you do, the higher the average figure will get although it will vary a lot between runs, especially with a smaller number of loops.

Each iteration has 50% chance of returning 2, 25% chance of returning 4, 12.5% chance of returning 8, and so on ...

The mathematical expectancy can be expressed as a series (which does not converge):

∑ 2*1/2 + 4*1/4 + 8*1/8 .... == ∑ 2^p/2^p == 1 + 1 + 1 + ...

In other words a varying sum of 1s who's average will be higher and higher as you take more samples.

Since you're not making an infinite number of attempts, you will get an actual figure out of this. Each of the powers of 2 (p) in the above series will average out to 1 until 2^p > n. Then the expected values of the higher powers will start to drop.

For example, for 2^30 (roughly 1000000000), you will get an average of 1 for the first 30 powers of 2. You will then get 0.5 for 2^31, 0.25 for 2^32, and so on. This will add up to roughly 2 on top the first 30. So, in theory, the average for n = 2^30 should be near 32.

More generally, this should give an average of 2 + log(n,2) which does not progress very rapidly but will reach infinity nontheless.

For example:

  • 1 million --> log(1000000,2)+2 --> 21.9
  • 100 million --> log(100000000,2)+2 --> 28.6
  • 1 billion --> log(1000000000,2)+2 --> 31.9

There will be variation on small samples because outliers in the higher powers of two will randomly be hit and have a significant impact on the average. The impact of these outliers is less perceptible when the number of samples is larger.

If what you're looking for is a way to generate a value with an exponential distribution, you may want to try random.expovariate() and use its value as an exponent of 2.

Upvotes: 2

tripcode
tripcode

Reputation: 74

You can do this effectively with numpy. For more information on the numpy random functionality, see: (link). This returns a numpy array with the resulting random sequence.

In [1]: import numpy as np

        np.random.randint(low=0, high=1, size=1000000000)

Out[2]: array([0, 0, 0, ..., 0, 0, 0])

Alternatively, you could use randbits as follows:

In [1]: from random import getrandbits
        nums = [not random.getrandbits(1) for i in range(0,100000)]

Out[2]: [False,True,...,False]

Upvotes: 1

Yaakov Bressler
Yaakov Bressler

Reputation: 12168

Here's a solution using numpy + iterators:

I'm not sure what your speed is, but this execution takes 6 +/- 0.5 seconds.

import numpy as np

np.random.seed(11)
n=0
total=0
while n<10**9:
    # 10**5 seems to be the golden number here
    n_vals = np.random.randint(0,2, 10**5)
    n+=10**5

    # The prevalence of zeros shouldn't be diff than that of ones
    n_zeros =  len(n_vals) - np.sum(n_vals)
    # square the value of money
    total += n_zeros*2

print(total/n)
# 1.000063652

Upvotes: 0

Iguananaut
Iguananaut

Reputation: 23366

As already commented this will be somewhat slow in pure Python no matter what.

But the plain random.random() (which random.randint() is built on top of, albeit with surprising complexity) already returns a floating point value between 0.0 and 1.0 (or, more precisely, in the range [0.0, 1.0)), so if you want a random binary value with a reasonably stable distribution you can do something like

from random import random # Avoiding the module attribute access on each loop will save some time
...
if random() < 0.5:
    # do one thing
else:
    # do the other

I think even for 1e9 loops this will be acceptable. Otherwise you might try Cython.

Upvotes: 4

Related Questions