Reputation: 8467
I have two nested loops, the outer one using an infinite generator, and the inner a finite iterator (a list of numbers):
for j in itertools.count():
for q in Q:
...
Under a given condition, I need to break
out of both loops, which makes the code inelegant (I have to set a flag to trigger the second break
). To avoid that, I thought I'd combine the two loops into one using
for j, q in itertools.product(itertools.count(), Q):
...
But when I did that, my computer got slower and slower, apparently swapping memory, until I had to kill the Python process.
The documentation of product
says
the actual implementation does not build up intermediate results in memory
so I assumed the result of product
to be a generator, which creates elements of the product on demand. But the slowing down and swapping speaks against that.
What did I wrong? How can I achieve what I wanted, a single loop combining an infinite generator with a list?
Upvotes: 0
Views: 214
Reputation: 13049
Adding in case it is useful to anyone, and based on https://stackoverflow.com/a/8420092/13596037, exceptions can be used to break or continue an outer loop, for example:
class ContinueOuter(Exception): pass
class BreakOuter(Exception): pass
try:
for i in range(10): # loop we want to break or continue from
try:
for j in range(3):
print(i, j)
if (i, j) == (3, 1):
raise ContinueOuter
elif (i, j) == (4, 0):
raise BreakOuter
except ContinueOuter:
pass
except BreakOuter:
pass
Upvotes: 1
Reputation: 280181
Your documentation quote comes from the following context:
This function is roughly equivalent to the following code, except that the actual implementation does not build up intermediate results in memory:
def product(*args, repeat=1): # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111 pools = [tuple(pool) for pool in args] * repeat result = [[]] for pool in pools: result = [x+[y] for x in result for y in pool] for prod in result: yield tuple(prod)
The intermediate results are all the result
lists. The actual itertools.product
implementation doesn't need to build those. It does, however, build something analogous to pools
.
itertools.product
needs to be able to iterate over its inputs repeatedly, so it materializes all inputs into tuples. This requirement doesn't apply to the first input when repeat
is 1
, but the implementation materializes all inputs anyway, perhaps for consistency or ease of implementation. itertools.count()
cannot be fully materialized, hence the results you saw.
One option you have is a generator expression. The following loop:
for j, q in ((j, q) for j in itertools.count() for q in Q):
will behave like you wanted itertools.product
to behave, though it will be slower than plain nested loops (because of the overhead of the generator). The performance penalty is just a constant factor, and it doesn't slow down the loop body, so it's rarely a problem.
If you can factor out the nested loops into a function, you can also use return
instead of break
:
def helper_function(stuff):
for j in itertools.count():
for q in Q:
...
if whatever:
return
...
Upvotes: 1
Reputation: 32244
You can create a generator using a comprehension. Comprehensions support iterating over multiple iterables at once using multiple for
s
for j, q in ((j, q) for j in itertools.count() for q in Q):
...
Upvotes: 1