Marcel Sonderegger

Reputation: 820

python3 bifurcated generator

I am looking for a way to copy a generator and then continue iterating with the copy. It is like a bifurcation of the generator.

def Generator():
    myNumbers=range(3)
    for i in myNumbers:
        yield i

for i in Generator():
    bifurcatedGenerator = Generator
    for j in bifurcatedGenerator():
        print (i, j)

This code gives the following output:

0 0
0 1
0 2
1 0
1 1
1 2 <- wrong
2 0
2 1 <- wrong
2 2 <- wrong

whereas the desired output is the following. (The bifurcated generator needs to be a new instance, but it should continue from the point the original generator has reached.)

0 0
0 1
0 2
1 1
1 2
2 2

The application itself is much more complicated; this is just a minimal code example.

What matters to me is a semantically clean solution that is easy for third parties to read. Efficiency is not so important.

Upvotes: 1

Views: 103

Answers (2)

berna1111

Reputation: 1861

Why not use a generator with a start parameter (and a stop one while you are at it)?

def Generator(start=0, stop=3):
    for i in range(start, stop):
        yield i

for i in Generator():
    for j in Generator(start=i):
        print (i, j)

This gives the desired output:

0 0
0 1
0 2
1 1
1 2
2 2
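
If the generator's signature cannot be changed to accept a start parameter, a similar effect can be sketched with itertools.islice: restart a fresh generator and skip the items the outer loop has already passed. This assumes the generator is cheap to restart and yields the same sequence every time; the pairs list is only there to make the result easy to inspect.

```python
from itertools import islice

def Generator():
    yield from range(3)

pairs = []
for index, i in enumerate(Generator()):
    # Start a fresh generator and skip the first `index` items,
    # so the inner loop begins at the outer loop's current item.
    for j in islice(Generator(), index, None):
        pairs.append((i, j))
        print(i, j)
```

Note that islice still runs the generator from the beginning and discards the skipped items, so this trades extra computation for not having to store anything.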

Upvotes: 2

Olivier Melançon

Reputation: 22324

Some people will tell you to use itertools.tee. Do not use itertools.tee.

Use a list

To keep track of the previous states of your generator, you need to store the previously yielded values somewhere. This is essentially what itertools.tee does internally when it copies a generator: it buffers every value that one copy has consumed and another has not.

Unfortunately, this removes the memory advantage of using a generator, so you are better off using a list directly.

def generator():
    yield from range(3)

lst = list(generator())

for i in range(len(lst)):
    for j in range(i, len(lst)):
        print(lst[i], lst[j])

Output:

0 0
0 1
0 2
1 1
1 2
2 2
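
If only the pairs themselves are needed, the two index loops can probably be replaced by itertools.combinations_with_replacement. This is a sketch under the assumption that the real application also just needs every item paired with itself and with each later item. The function materializes its input internally, so the memory cost is comparable to the list version.

```python
from itertools import combinations_with_replacement

def generator():
    yield from range(3)

# Every item paired with itself and with each later item, in order.
pairs = list(combinations_with_replacement(generator(), 2))

for i, j in pairs:
    print(i, j)
```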

Why not use itertools.tee then?

It is still possible to use itertools.tee, but you should not.

from itertools import tee

def generator():
    yield from range(3)

main_gen, bif_gen = tee(generator())

for i in main_gen:
    for j in bif_gen:
        print(i, j)
    # Branch off again at main_gen's current position.
    # Yes, you *must* use the second item here.
    _, bif_gen = tee(main_gen)

The reason the previous code works is subtle: when itertools.tee is given a tee object, it returns that same tee object as the first item of its result. Using the first item as the branch would therefore consume main_gen itself, which is why the second item must be used.
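
For illustration only, here is a sketch that does not depend on which item of the tuple is which: it rebinds both names from every tee call and advances the main iterator by hand. The helper name bifurcated_pairs is hypothetical, and the memory caveat about tee still applies.

```python
from itertools import tee

def generator():
    yield from range(3)

def bifurcated_pairs(iterable):
    """Yield (i, j) where j runs over a branch taken at i's position."""
    main = iter(iterable)
    while True:
        # Split before consuming, so the branch still includes item i.
        main, branch = tee(main)
        try:
            i = next(main)
        except StopIteration:
            return
        for j in branch:
            yield i, j

for i, j in bifurcated_pairs(generator()):
    print(i, j)
```

Because both results of tee are rebound on every pass, nothing else holds a reference to the iterator that was split.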

This, coupled with the fact that the documentation explicitly says a list is better in this situation, shows that the first solution should be preferred:

This itertool may require significant auxiliary storage (depending on how much temporary data needs to be stored). In general, if one iterator uses most or all of the data before another iterator starts, it is faster to use list() instead of tee().

Upvotes: 1
