Reputation: 1304
I have a list of numbers and I'm trying to write a function that will choose n random indices i such i's likelihood is percentages[i]
Function:
def choose_randomly(probabilities, n):
percentages = accumulated_s(probabilities)
result = []
for i in range(n):
r = random()
for j in range(n):
if r < percentages[j]:
result = result + [j]
return result
accumulated_s will just generate a corresponding list of probabilities.
I'm expecting results like this:
choose_randomly([1, 2, 3, 4], 2) -> [3 3 0]
choose_randomly([1, 2, 3, 4], 2) -> [1 3 1]
The problem is that this is not returning n indicies. Can anyone point out what I'm doing wrong? Thank you so much!
Upvotes: 1
Views: 68
Reputation: 155438
Once you've found the right range of probabilities, you're done; break
out of the inner loop to generate the next value, or you'll act as if all probabilities above the correct threshold were matched as well:
# Enumerate all percentages, not just first n
for j, pct in enumerate(percentages):
if r < pct:
result.append(j) # Don't create tons of temporary lists; mutate in place
break # <-- Don't add more results
Also note, if you have a lot of values in the set of probabilities, it may make sense to use functions from the bisect
module to find the correct value, rather than scanning linearly each time; for a small number of entries in percentages
, linear scanning is fine, but for a large number, O(log n)
lookups may beat O(n)
scans.
Upvotes: 1