Reputation: 11648
I need to iterate over a list an arbitrary number of times, yield
ing each element of the list in random order (a different order each time I iterate through the full list). I need to yield each element once before yielding that element a second time, yield each element twice before yielding that element a third time, etc.
Currently, my code looks like this:
def random_yield(data):
random.shuffle(data)
data_index = 0
while True:
yield data[data_index]
data_index += 1
if data_index == len(data):
random.shuffle(data)
data_index = 0
Is there a way to do this more efficiently so I don't pay the performance penalty of the random.shuffle()
after every len(data)
yield
s?
Upvotes: 0
Views: 165
Reputation: 241691
You can do a Fisher-Yates shuffle one step per iteration, thereby distributing the cost evenly over each iteration. That's not more efficient -- in fact, it is probably less efficient, given that the library function is probably faster than Python code -- but it avoids long pauses.
The code is not significantly different from just grabbing a random element each time. The only difference is that you grab the random elements from a subset of the vector:
from random import randrange
def random_yield(data):
index = 0
limit = len(data)
while True:
if index + 1 >= limit:
yield data[index]
index = 0
else:
# Get a random element which we haven't yet used this cycle
# (This is a single iteration of the F-Y shuffle algorithm)
j = randrange(index, limit)
rv = data[j]
yield rv
# Swap the element we just selected so its not in the next subrange
data[j] = data[index]
data[index] = rv
index += 1
Upvotes: 3