Faster alternatives to using numpy.random.choice in Python?

Question

My goal is to generate a large 2D array in Python where each number is either a 0 or 1. To do this, I created a nested for-loop as shown below:

    import numpy

    for count in range(0,300):
      block = numpy.zeros((8,300000))

      for a in range(0,8):
        for b in range(0,300000):
          block[a][b] = numpy.random.choice(2,1, p=[0.9,0.1])

Each element of the block has a 90% chance of being a "0" and a 10% chance of being a "1". But a single pass of the outer for loop takes over 1 minute. Is there a more efficient way to pick random numbers for a large number of arrays while still being able to use the "p" values? (This is my first post, so sorry if the formatting is broken.)

user2357112 · Accepted Answer

The idea behind NumPy is to not loop through 720,000,000 iterations (300 × 8 × 300,000) at the Python level. You're supposed to use whole-array operations, like having numpy.random.choice generate an entire array of choices in one call:

    block = numpy.random.choice(2, size=(8, 300000), p=[0.9, 0.1])

This completes almost instantly.
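
For comparison, here is a minimal sketch of two other vectorized ways to build the same kind of block. It is not part of the original answer: it assumes a NumPy version that provides numpy.random.default_rng, and the seed, the variable names, and the uint8 dtype are illustrative choices. The first option uses Generator.choice, the newer counterpart of numpy.random.choice; the second thresholds uniform samples, which only works for this two-outcome case.

```python
import numpy as np

# Seeded generator so the sketch is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(12345)

shape = (8, 300000)
p_one = 0.1  # probability of drawing a 1, as in the question

# Option 1: one call that fills the whole array, keeping the p argument.
block_choice = rng.choice(2, size=shape, p=[1 - p_one, p_one])

# Option 2: draw uniform floats in [0, 1) and mark those below p_one as 1.
# This is specific to the 0/1 case; more outcomes still need choice with p.
block_threshold = (rng.random(shape) < p_one).astype(np.uint8)

# The fraction of ones in each block should be close to p_one.
print(block_choice.shape, block_choice.mean())
print(block_threshold.shape, block_threshold.mean())
```

Either call replaces the two inner loops from the question, so the outer loop over 300 blocks only makes 300 NumPy calls in total.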
