Reputation: 500267
From time to time I find myself writing recursive generators in Python. Here is a recent example:
def comb(input, lst=[], lset=set()):
    if lst:
        yield lst
    for i, el in enumerate(input):
        if lset.isdisjoint(el):
            for out in comb(input[i+1:], lst + [el], lset | set(el)):
                yield out

for c in comb([[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]):
    print c
The details of the algorithm are not important. I include it as a complete, real-world illustration to give the question some context.
My question is about the following construct:
for out in comb(...):
    yield out
Here, comb() is the recursive instantiation of the generator.
Every time I have to spell out the for: yield loop, it makes me cringe. Is this really the way to write recursive generators in Python, or are there superior (more idiomatic, more performant, etc.) alternatives?
Upvotes: 5
Views: 1038
Reputation: 365697
Every time I have to spell out the for: yield loop, it makes me cringe. Is this really the way to write recursive generators in Python, or are there superior (more idiomatic, more performant, etc.) alternatives?
There is a superior alternative:
yield from comb(...)
This does effectively the same thing as:
for out in comb(...):
    yield out
This requires Python 3.3. If you're on Python 2.x (or an older 3.x), you have to stick with the old way, because Python 2's syntax will never be updated again after 2.7 (and 3.0 through 3.2 are obviously just as frozen).
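For concreteness, here's the comb() from the question with the inner loop replaced by yield from (a sketch; the print call is also switched to Python 3 syntax, since yield from needs 3.3+):
import sys  # Python 3.3+ only

def comb(input, lst=[], lset=set()):
    if lst:
        yield lst
    for i, el in enumerate(input):
        if lset.isdisjoint(el):
            # One line instead of the two-line for/yield loop.
            yield from comb(input[i+1:], lst + [el], lset | set(el))

for c in comb([[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]):
    print(c)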
First, see the pure Python yield from that Wessie mentioned in the comments. This version only works with a single level of "yield from", but there's a link at the bottom to a more flexible and optimized (but harder to understand) version. It doesn't seem to actually work (I get a NameError on _stack), but it looks like it should be easy to fix. If so, and if it's acceptable to put a @supergenerator decorator on the outermost generator, and if the performance is acceptable, there's your answer.
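I haven't reproduced Wessie's gist here, but to give a rough idea of how a pure-Python substitute works, here's my own minimal sketch (not the gist's code; the recurse_into marker name is made up for illustration). The decorator drives nested generators from an explicit stack, so inner levels don't need a for/yield loop of their own:
class recurse_into(object):
    """Hypothetical marker: wrap a nested generator instead of looping over it."""
    def __init__(self, gen):
        self.gen = gen

def supergenerator(func):
    def wrapper(*args, **kwargs):
        stack = [func(*args, **kwargs)]
        while stack:
            try:
                value = next(stack[-1])
            except StopIteration:
                stack.pop()  # innermost generator exhausted; resume its caller
                continue
            if isinstance(value, recurse_into):
                stack.append(value.gen)  # descend without a for/yield at this level
            else:
                yield value  # ordinary values pass straight through
    return wrapper
A real replacement also has to forward send() and throw() into the innermost generator, which is part of what makes the optimized version harder to understand.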
If not, there are various tricks you can do to handle multiple levels of yield-looping all in one place, instead of at each level. However, none of them will get you down to 0 levels—and really, they're rarely worth doing. For example:
Once you think in terms of sequences instead of generator functions, it's pretty clear that all we're trying to do is flatten a sequence. Whether you're trying to flatten N levels, flatten until reaching a non-iterable, flatten until satisfying some other predicate, etc., there's a simple algorithm for that; you just have to pick the right one. But will it make your code more idiomatic, readable, performant, etc.? Rarely. Let's take a super simple case.
import itertools

def flatten(seq, levels=1):
    # Each pass of chain.from_iterable strips exactly one level of nesting.
    for level in range(levels):
        seq = itertools.chain.from_iterable(seq)
    return seq
So:
def a():
    yield 1
    yield 2
    yield 3

def b():
    yield a()

def c():
    yield b()

def d():
    yield c()

for i in flatten(d(), 3):
    print i
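For reference, this prints 1, 2, and 3: d() yields the generator c(), which yields b(), which yields a(), and each of the three chain.from_iterable passes unwraps one of those layers, leaving just the values from a().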
The benefit is that I only had to deal with the nesting in one place, at the call site, instead of in 3 places, at every generator along the way. The cost is that it's less obvious to the reader what's going on, and much easier to get something wrong. (Well, not so much in this case… but imagine flattening until lambda x: isinstance(x, list), testing the hell out of it, releasing it, and then someone calls comb on a tuple…) The cure is worse than the disease, which is why I called it a trick.
Unless the flattening really is a natural part of the algorithm, or some of the in-between steps are code that you can't or don't want to touch, or structuring things this way is a useful illustration or reminder of something, or…
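To make that tuple gotcha concrete, here's a sketch of the predicate-driven variant (my illustration, not code from the answer): it recurses into items the predicate accepts and treats everything else as a leaf.
def flatten_while(seq, is_branch=lambda x: isinstance(x, list)):
    # Recurse only into items the predicate calls a branch.
    for item in seq:
        if is_branch(item):
            for sub in flatten_while(item, is_branch):
                yield sub
        else:
            yield item
So list(flatten_while([(1, 2), [3, [4]]])) gives [(1, 2), 3, 4]: the tuple slips through untouched, which is exactly the kind of bug described above.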
Just for fun, I wrote an all-singing-all-dancing flatten-any-way-you-want function and submitted it as a patch to Erik Rose's nifty more-itertools library. Even if he doesn't accept it, you can find it in my fork—it's called collapse, and it's the last function in the file.
Upvotes: 8