Hadar
Hadar

Reputation: 678

trying to understand recursive run of generators

I'm trying to figure out how to plot the run of this code onto a recursion tree, because im not quite sure how it operates, even when I'm debugging. what each yield is doing and why do I need them both?

Ive tried creating a somewhat tree that connects each run to its next recursively but I dont know what follows yield.data where the head is 'c'

class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def get_reverse_iterator(head):
    if head.next:
        for datum in get_reverse_iterator(head.next):
            yield datum
    yield head.data


lst = Node('a', Node('b', Node('c')))
for x in get_reverse_iterator(lst):
    print(x)

the result should be: c b a

Upvotes: 1

Views: 143

Answers (2)

Booboo
Booboo

Reputation: 44313

To understand how it works you need to understand the basic idea of recursion. Let's suppose that we are not dealing with a generators; we just wish to print all the nodes of a list in reverse given the head node. We call the function print_reverse passing the node as the argument. If the node's next field is empty we just print the field's data value. But if next is not empty, it is pointing to a node that must be printed before the current node is printed. So we recursively call print_reverse again to first print that node. When print_reverse returns we can now print the current node. Of course, when we call print_reverse recursively to print the next node, it may discover that there is yet another node that it points to which must first be printed and we will be calling print_reverse recursively yet again. So we have:

class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def print_reverse(head):
    if head.next:
        print_reverse(head.next)
    print(head.data)


lst = Node('a', Node('b', Node('c')))
print_reverse(lst)

The above code must be understood before the generator problem can be understood. Instead of creating a function print_reverse that prints the node's data field, we wish instead to create a generator function that yields the value. So, it makes sense to rename the function and to replace the print function with a yield statement and the recursive call with a yield from statement:

class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def get_reverse_iterator(head):
    if head.next:
        #print_reverse(head.next)
        yield from get_reverse_iterator(head.next)
    #print(head.data)
    yield head.data

lst = Node('a', Node('b', Node('c')))

Now we can use the generator as in:

for x in get_reverse_iterator(lst):
    print(x)

or:

l = [x in get_reverse_iterator(lst)]

But an alternative to using recursion that avoids creating multiple generator objects, would be:

def get_reverse_iterator(head):
    stack = []
    while head.next:
        stack.append(head)
        head = head.next
    yield head.data
    while len(stack):
        head = stack.pop()
        yield head.data

Upvotes: 1

Green Cloak Guy
Green Cloak Guy

Reputation: 24711

Whenever you call the method as a generator (e.g. for x in get_reverse_iterator()), python starts executing that method line by line. Whenever it hits a yield, it stops cold and returns that. When it gets asked for a next() value in the next iteration of the for loop, it continues to execute.

This looks like a fairly straightforward linked-list-traversal idiom, where each element of the list contains data that is itself a list (or some other iterable value, like a string):

list[0].data = [1, 2, 3, 4]
list[1].data = [5, 6, 7, 8]
...
list[9].data = [37, 38, 39, 40]

So what the code is doing here is printing those sub-lists from the back of the main list to the front of the main list. The output should look something like this:

37 38 39 40 33 34 35 36 ... 5 6 7 8 [1, 2, 3, 4]

which becomes evident when you look at how the code executes. I'll rewrite it in words:

func get_reverse_iterator(head) {
    if head isn't the last element of the list, then
        call this function on the next element of the list (head.next)
        for every element in the return value of that,
            yield that element
    yield this element's data

The 'base case' is the last element of the list, which doesn't have a .next. So its data, which is iterable, gets returned to the second-to-last element. The second-to-last element yields every element of that data in turn, and then returns its own data to the third-to-last element. The third-to-last element yields every element of that data in turn, and so on, until finally you get to the first element of the list. Every single yield statement thus far has passed one element up the chain, recursively, and so that inner for loop for the first element has yielded 36 values so far. Finally, all the later elements in the list are done passing values through, and so the first element gets to the last statement of the function and yields its own data.

But there's nothing left to catch that yielded data and parse it by individual element, so it gets printed as the list it was in the first place. Or, at least, that is for my example presented above.


In your case, it's more straightforward, because when you iterate over a string each item is still a string. But it's the same thing on a smaller scale:

  1. get_reverse_iterator() is called on the root node of lst
  2. The root node (I'll call it NodeA) has a .next
  3. get_reverse_iterator() is called on the next node, which I'll call NodeB
  4. NodeB has a .next
  5. get_reverse_iterator() is called on the next node, which I'll call NodeC
  6. NodeC does not have a .next
  7. get_reverse_iterator(NodeC) skips the for loop and yields NodeC.data, which is 'c'`
  8. get_reverse_iterator(NodeB) catches 'c' inside the for loop and yields it
  9. get_reverse_iterator(NodeA) catches 'c' inside the for loop and yields it
  10. 'c' gets assigned to x, and it gets printed.
  11. The next iteration of the outer loop happens, and execution returns to get_reverse_iterator(NodeB)
  12. The for loop ends, because get_reverse_iterator(NodeC) has stopped yielding things
  13. get_reverse_iterator(NodeB) ends the for loop, exits the if block, and finally yields NodeB.data, which is 'b'
  14. get_reverse_iterator(NodeA) catches 'b' inside the for loop and yields it
  15. 'b' gets assigned to x, and it gets printed.
  16. The next iteration of the outer loop happens, and execution returns to get_reverse_iterator(NodeA)
  17. The for loop ends, because get_reverse_iterator(NodeC) has stopped yielding things
  18. get_reverse_iterator(NodeA) ends the for loop, exits the if block, and finally yields NodeA.data, which is 'a'
  19. 'a' gets assigned to x, and it gets printed
  20. The outer for loop finishes, as get_reverse_iterator(NodeA) has stopped yielding things.

Upvotes: 0

Related Questions