TheSaxo

Reputation: 17

Python - next method not working properly with generator

I made a class in Python that splits a stream of code into tokens and advances token by token to work with them:

import re

class Tokenizer:

    def __init__(self, input_file):
        self.in_file = input_file
        self.tokens = []
        self.current_token = None
        self.next_token = None
        self.line = 1

    def split_tokens(self):
        ''' Create a list with all the tokens of the input file '''
        self.tokens = re.findall(r"\w+|[{}()\[\].;,+\-*/&|<>=~\n]", self.in_file)

    def __iter__(self):
        for token in self.tokens:
            if token != '\n':
                yield token 
            else:
                self.line += 1

    def advance(self):
        self.current_token = self.next_token
        self.next_token = next(self.__iter__())

After initialization:

text = 'constructor SquareGame03 new()\n\
       {let square=square;\n\
       let direction=direction;\n\
       return square;\n\
       }'

t = Tokenizer(text)
t.split_tokens()
t.advance()

It seems to work if I print the tokens:

print(t.current_token, t.next_token)
None constructor

but every other call of the advance method gives these results:

t.advance()
print(t.current_token, t.next_token)
constructor constructor
t.advance()
print(t.current_token, t.next_token)
constructor constructor

So it's not advancing, and I can't understand why.

Upvotes: 0

Views: 1051

Answers (1)

Dummmy

Reputation: 798

Here, .__iter__ is implemented as a generator function, so every call to it returns a brand-new generator iterator.

Each time Tokenizer.advance is called, self.__iter__() creates a fresh iterator, and calling next() on it always yields the first token again. Instead, the Tokenizer object should create the iterator once (for example at the end of split_tokens) and store it for all subsequent calls.

For example:

import re

class Tokenizer:

    def __init__(self, input_file):
        self.in_file = input_file
        self.tokens = []
        self.current_token = None
        self.next_token = None
        self.line = 1

    def split_tokens(self):
        ''' Create a list with all the tokens of the input file '''
        self.tokens = re.findall(r"\w+|[{}()\[\].;,+\-*/&|<>=~\n]", self.in_file)
        self.iterator = self.__iter__()

    def __iter__(self):
        for token in self.tokens:
            if token != '\n':
                yield token 
            else:
                self.line += 1

    def advance(self):
        self.current_token = self.next_token
        self.next_token = next(self.iterator)
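
With the iterator stored once in split_tokens, successive advance calls now move through the tokens. A quick self-contained check (repeating the corrected class, with a shortened version of the question's sample input):

```python
import re

class Tokenizer:

    def __init__(self, input_file):
        self.in_file = input_file
        self.tokens = []
        self.current_token = None
        self.next_token = None
        self.line = 1

    def split_tokens(self):
        ''' Create a list with all the tokens of the input file '''
        self.tokens = re.findall(r"\w+|[{}()\[\].;,+\-*/&|<>=~\n]", self.in_file)
        self.iterator = self.__iter__()  # create the iterator ONCE and keep it

    def __iter__(self):
        for token in self.tokens:
            if token != '\n':
                yield token
            else:
                self.line += 1

    def advance(self):
        self.current_token = self.next_token
        self.next_token = next(self.iterator)  # same iterator every call

t = Tokenizer('constructor SquareGame03 new()\n{let square=square;}')
t.split_tokens()
t.advance()
print(t.current_token, t.next_token)  # None constructor
t.advance()
print(t.current_token, t.next_token)  # constructor SquareGame03
```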

A minimal example with a Fibonacci generator may make this clearer:

def fib():
    a = 0
    b = 1
    while True:
        yield b
        a, b = b, a + b

# 1, 1, 2, ...
fibs = fib()
next(fibs)
next(fibs)
next(fibs)

# 1, 1, 1, ...
next(fib())
next(fib())
next(fib())

By the way, I cannot see a reason to mix the .__iter__ magic method with a separate .advance method; it may introduce confusion.
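
As one possible simplification (my own sketch, not the question's code), drop __iter__ entirely and keep a private generator, so all iteration state lives in one place; the two-argument form of next() also lets advance stop cleanly at the end instead of raising StopIteration:

```python
import re

class Tokenizer:

    def __init__(self, text):
        self.text = text
        self.line = 1
        self.current_token = None
        self.next_token = None
        self._gen = self._tokens()  # single generator, created once

    def _tokens(self):
        # Yield tokens, counting newlines as we pass them.
        for token in re.findall(r"\w+|[{}()\[\].;,+\-*/&|<>=~\n]", self.text):
            if token == '\n':
                self.line += 1
            else:
                yield token

    def advance(self):
        self.current_token = self.next_token
        # next(gen, None) returns None when the tokens run out,
        # instead of raising StopIteration.
        self.next_token = next(self._gen, None)

t = Tokenizer('let x = 1;\nreturn x;')
t.advance()
t.advance()
print(t.current_token, t.next_token)  # let x
```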

Upvotes: 1
