Reputation:
HI I have a function parse()
that takes a list of operators and operands (ex ['+', '20', '10']
) tokens, and an index i, and constructs a tree from them. When an operator ( a node with a .left
and .right
) is found, the function goes down until it reaches a literal ( a number ) or a letter variable. The problem is I would like the index i that is changing within the recursion to stay modified when the recursion on the left side is done and the function moves on to the right side. For example, to get this tree from [-, //, y, 2, x]:
I have written this:
def parse(tokens, i):
"""parse: tuple(String) * int -> (Node, int)
From an infix stream of tokens, and the current index into the
token stream, construct and return the tree, as a collection of Nodes,
that represent the expression."""
if tokens[i].isdigit():
return mkLiteralNode(tokens[i])
elif tokens[i] == "+":
return mkAddNode(parse( tokens, i + 1 ), parse( tokens, i + 2 )) # first argument is the node.left, second is the node.right
elif tokens[i] == "-":
return mkSubtractNode(parse( tokens, i + 1 ), parse( tokens, i + 2 ))
elif tokens[i] == "*":
return mkMultiplyNode(parse( tokens, i + 1 ), parse( tokens, i + 2 ))
elif tokens[i] == "//":
return mkDivideNode(parse( tokens, i + 1 ), parse( tokens, i + 2 ))
else:
return mkVariableNode(tokens[i])
When the SubtractNode is made, i is 0 and then increments normally, but when the left side is done i is again 0 and the parse(tokens, i + 2)
points to the y instead of the x, making something like this:
How is it possible to make the above tree without using pop()
in tokens?
Upvotes: 3
Views: 3629
Reputation: 1101
It's much easier to write such things, when you think of your tokens
list as of an object, that is itself responsible for storing its position. That's why in general, the tokens
should come from a Lexer, which usually just has one method: next_token
. We'll simulate this with an iterator:
def parse(tokens):
"""parse: tokens_iter or generator -> Node
From an infix stream of tokens, and the current index into the
token stream, construct and return the tree, as a collection of Nodes,
that represent the expression."""
next_tok = next(tokens)
if next_tok.isdigit():
return ('literal', next_tok)
elif next_tok == "+":
return ('add', parse( tokens ), parse( tokens )) # first argument is the node.left, second is the node.right
elif next_tok == "-":
return ('sub', parse( tokens ), parse( tokens ))
elif next_tok == "*":
return ('mul', parse( tokens ), parse( tokens ))
elif next_tok == "//":
return ('div', parse( tokens ), parse( tokens ))
else:
return ('variable', next_tok )
# And, example:
print(parse(iter(['-', '//', 'y', '2', 'x'])))
That will print:
('sub', ('div', ('variable', 'y'), ('literal', '2')), ('variable', 'x'))
You'll also need to handle the StopIteration
exception, and turn it into a meaningful ParseError
.
Upvotes: 3
Reputation: 281958
Have parse
return the index of the next unused token as well as the parse tree, and use that to determine where to continue parsing. For example:
left, rightIndex = parse(tokens, i+1)
right, returnIndex = parse(tokens, nextIndex)
return mkWhateverNode(left, right), returnIndex
Upvotes: 0