Weighted random numbers in Python from a list of values

Question

I am trying to create a list of 10,000 random numbers between 1 and 1000. But I want 80-85% of the numbers to come from the same category (I mean that some 100 of these numbers should make up about 80% of the list of random numbers), with the rest appearing around 15-20% of the time. Any idea if this can be done in Python/NumPy/SciPy? Thanks.

Divakar · Accepted Answer

Here's an approach:

import numpy as np

a = np.arange(1, 1001)  # Input array to extract numbers from

# Select 100 unique random numbers from the input array and also store the leftovers
p1 = np.random.choice(a, size=100, replace=False)
p2 = np.setdiff1d(a, p1)

# Get random indices for indexing into p1 (80% of the output) and p2 (20%)
p1_idx = np.random.randint(0, p1.size, 8000)
p2_idx = np.random.randint(0, p2.size, 2000)

# Index, concatenate, and randomize the positions
out = np.random.permutation(np.hstack((p1[p1_idx], p2[p2_idx])))

Let's verify after a run:

In [78]: np.in1d(out, p1).sum()
Out[78]: 8000

In [79]: np.in1d(out, p2).sum()
Out[79]: 2000
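
The approach above fixes the split at exactly 8000/2000. If an approximate split is acceptable, one alternative is to give each value its own probability and draw everything in a single call. This is only a minimal sketch, not part of the original answer: the names `rng`, `favored`, and `probs` are made up for illustration, the seed is arbitrary, and it assumes a NumPy version that provides `np.random.default_rng` and `np.isin`.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability

values = np.arange(1, 1001)

# Pick the 100 numbers that should dominate the output (hypothetical name)
favored = rng.choice(values, size=100, replace=False)

# Per-value probabilities: the favored 100 share 0.8, the other 900 share 0.2
is_favored = np.isin(values, favored)
probs = np.where(is_favored, 0.8 / 100, 0.2 / 900)
probs = probs / probs.sum()  # guard against floating-point drift

# Draw all 10,000 numbers at once, weighted by probs
out = rng.choice(values, size=10_000, p=probs)

# Fraction of draws that came from the favored group (close to 0.8, not exact)
favored_share = np.isin(out, favored).mean()
print(favored_share)
```

Here the share of favored numbers varies a little from run to run around 0.8, which may be closer to the "80-85%" wording in the question; the index-based version is the one to use when the counts must be exact.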
