Numpy random choice with probabilities to produce a 2D-array with unique rows

Question

Similar to "Numpy random choice to produce a 2D-array with all unique values", I am looking for an efficient way of generating the following:

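import numpy as np

n = 1000
k = 10
number_of_combinations = 1000000

p = np.random.rand(n)
p /= np.sum(p)

# Desired result only: this call raises a ValueError, because replace=False
# applies to the whole sample and number_of_combinations * k exceeds n.
my_combinations = np.random.choice(n, size=(number_of_combinations, k), replace=False, p=p)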

As discussed in the previous question, I want this matrix to have only unique rows. Unfortunately, the solutions provided there do not work once specific probabilities p are required.

My current solution is as follows:

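my_combinations = set()

# Draw one weighted combination at a time until enough distinct ones exist.
while len(my_combinations) < number_of_combinations:
    new_combination = np.random.choice(n, size=k, replace=False, p=p)
    # frozenset is hashable and ignores order, so the set drops duplicates.
    my_combinations.add(frozenset(new_combination))

print(my_combinations)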

However, I think there should be a faster, vectorized NumPy approach.

Answers (1)
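
One possible direction is a batched version of the loop in the question, given here as a minimal sketch under stated assumptions. It swaps the per-row np.random.choice call for the Gumbel top-k trick: add Gumbel noise to log(p) and keep the k largest entries of each row, which, as far as I know, follows the same distribution as sequential weighted sampling without replacement. The function name sample_unique_rows and the batch_size parameter are made up for illustration. The sketch assumes every p[i] > 0, k <= n, and that far more than n_rows distinct combinations exist (otherwise the loop may never finish).

```python
import numpy as np


def sample_unique_rows(p, k, n_rows, rng=None, batch_size=10_000):
    """Return an (n_rows, k) array of distinct k-subsets of range(len(p)).

    Hypothetical helper: rows are drawn with weights p and without
    replacement inside each row; duplicated rows are discarded.
    """
    rng = np.random.default_rng() if rng is None else rng
    log_p = np.log(np.asarray(p, dtype=float))  # assumes every p[i] > 0
    rows = np.empty((0, k), dtype=np.intp)

    while rows.shape[0] < n_rows:
        # Gumbel top-k trick: perturb log-probabilities with Gumbel noise
        # and take the indices of the k largest keys in every row.
        keys = log_p + rng.gumbel(size=(batch_size, log_p.size))
        top_k = np.argpartition(-keys, k - 1, axis=1)[:, :k]
        # Sort inside each row so equal sets become identical rows,
        # mirroring the frozenset comparison in the question.
        top_k = np.sort(top_k, axis=1)

        # Keep the first occurrence of each row, in order of arrival.
        stacked = np.vstack([rows, top_k])
        _, first_idx = np.unique(stacked, axis=0, return_index=True)
        rows = stacked[np.sort(first_idx)]

    return rows[:n_rows]


if __name__ == "__main__":
    n, k = 1000, 10
    p = np.random.rand(n)
    p /= p.sum()
    print(sample_unique_rows(p, k, 100_000).shape)
```

Each batch allocates a batch_size x n float array, so batch_size trades memory for fewer iterations. Re-running np.unique on the growing array is the simplest way to deduplicate but not the cheapest; I have not benchmarked this against the set-based loop.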
