Reputation: 33
I am generating an array of shape (1109, 8) with values drawn at random from a fixed set of numbers, [18, 24, 36, 0]. I need every row to contain exactly 5 zeros, but adjusting the probability weights alone does not guarantee that.
My workaround code is below, but is there an easier way, either with another function or by adjusting some of the parameters of the generator? https://numpy.org/doc/stable/reference/random/generator.html
# Random output using the new Generator API
import numpy as np
from numpy.random import default_rng

rng = default_rng(1)
# arr is an existing array of shape (1109, 8), defined elsewhere
# generate an array with random values of test duration
test_duration = rng.choice([18, 24, 36, 0], size=arr.shape, p=[0.075, 0.1, 0.2, 0.625])
# ensure the number of tests (nonzero entries) in each row equals n_tests
n_tests = 3
non_tested = arr.shape[1] - n_tests  # number of zeros per row (not used below)
for row in range(len(test_duration)):
    # redraw the whole row until it has exactly n_tests nonzero values
    while np.count_nonzero(test_duration[row, :]) != n_tests:
        new_test = rng.choice([18, 24, 36, 0], size=arr.shape[1], p=[0.075, 0.1, 0.2, 0.625])
        test_duration[row, :] = new_test
print('There are no days exceeding n_tests')
# print(test_duration)
print(test_duration[:10, :])
Upvotes: 1
Views: 250
Reputation: 36598
If you need 5 zeros in every row, you can randomly select 3 values from [18, 24, 36], pad the rest of the row with zeros, and then shuffle each row independently. NumPy's shuffle works in place, so you don't need to reassign the result.
import numpy as np
c = [18, 24, 36]
p = np.array([0.075, 0.1, 0.2])
p = p / p.sum()  # normalize the probs so they sum to 1
a = np.random.choice(c, size=(1109, 3), replace=True, p=p)
a = np.hstack([a, np.zeros((1109, 5), dtype=np.int32)])
list(map(np.random.shuffle, a))  # shuffles each row in place
a
# returns:
array([[ 0,  0,  0,  0, 36,  0, 36, 36],
       [ 0, 36,  0, 24, 24,  0,  0,  0],
       [ 0,  0,  0,  0, 36, 36, 36,  0],
       ...,
       [ 0,  0,  0, 24, 24, 36,  0,  0],
       [ 0, 24,  0,  0,  0, 36,  0, 18],
       [ 0,  0,  0, 36, 36, 24,  0,  0]])
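The same pad-then-shuffle idea can probably be written without the Python-level `map`, using the newer `Generator` API that the question links to. The following is only a sketch, assuming NumPy 1.20 or later (where `Generator.permuted` is available); the names `n_rows`, `n_cols`, `n_tests`, `nonzero` and `padded` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative names, not from the original posts
n_rows, n_cols, n_tests = 1109, 8, 3
values = np.array([18, 24, 36])
p = np.array([0.075, 0.1, 0.2])
p = p / p.sum()  # normalize so the probabilities sum to 1

# Draw the 3 nonzero values for every row, then pad each row with 5 zeros
nonzero = rng.choice(values, size=(n_rows, n_tests), p=p)
padded = np.hstack([nonzero, np.zeros((n_rows, n_cols - n_tests), dtype=nonzero.dtype)])

# permuted() shuffles each row independently and returns a new array
a = rng.permuted(padded, axis=1)
print(a[:10])
```

Unlike `shuffle`, `permuted` returns a shuffled copy by default, so the result has to be assigned.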
Upvotes: 1
Reputation: 594
You could instead randomly choose the positions of the 5 zeros in each row, which guarantees that every row has exactly 5 zeros, and then fill the remaining positions by sampling from [18, 24, 36] with their normalized probabilities.
But by doing this you are no longer drawing each entry independently from the probability distribution you specified in the first place. I don't know what application you are using this for, but it is a point to consider.
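One way this suggestion might look in code is sketched below. It is not the answerer's own code: it assumes the shape and weights from the question, and the names `zero_cols`, `n_zeros` and the argsort-of-random-keys trick for picking distinct columns are choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative names, not from the original posts
n_rows, n_cols, n_zeros = 1109, 8, 5
values = np.array([18, 24, 36])
p = np.array([0.075, 0.1, 0.2])
p = p / p.sum()  # normalize so the probabilities sum to 1

# Start with a nonzero value in every cell
test_duration = rng.choice(values, size=(n_rows, n_cols), p=p)

# Sorting a row of uniform random keys yields a random permutation of the
# column indices; the first n_zeros of them are 5 distinct columns per row
zero_cols = np.argsort(rng.random((n_rows, n_cols)), axis=1)[:, :n_zeros]

# Overwrite the chosen positions with zeros, in place
np.put_along_axis(test_duration, zero_cols, 0, axis=1)
print(test_duration[:10])
```

With this construction the overall fraction of zeros is exactly 5/8 = 0.625, which happens to equal the weight given to 0 in the question, although the number of zeros per row is now fixed rather than random.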
Upvotes: 0