Reputation: 192
Suppose I would like to write a fairly simple programming language, and I want to implement operators such like 2 + 3 * 2 = 8
What is the general way to implement things like this?
Upvotes: 2
Views: 861
Reputation: 175655
I'm not sure how much detail you're interested in, but it sounds like you're looking to implement a parser. There's typically two steps:
The lexer reads over the text and converts it to tokens. For example, it might read "2 + 3 * 2" and convert it to INTEGER
PLUS
INTEGER
STAR
INTEGER
The parser reads in the tokens and tries to match them to rules. For example, you might have these rules:
Expr := Sum | Product | INTEGER;
Sum := Expr PLUS Expr;
Product := Expr STAR Expr;
It reads the tokens and tries to apply the rules such that the start rule maps to the tokens its read in. In this case, it might do:
Expr := Sum
Expr := Expr PLUS Expr
Expr := INTEGER(2) PLUS Expr
Expr := INTEGER(2) PLUS Product
Expr := INTEGER(2) PLUS Expr STAR Expr
Expr := INTEGER(2) PLUS Integer(3) STAR Expr
Expr := INTEGER(2) PLUS Integer(3) STAR Integer(2)
There are many types of parsers. In this example I read from left to right, and started from the initial expression, working down until I'd replaced everything with a token, so this would be an LL parser. As it does this replacement, it can generate an abstract syntax tree that represents the data. The tree for this might look something like:
You can see that the Product rule is a child of the Sum rule, so it will end up happening first: 2 + (3 * 2)
. If the expression had been parsed differently we might've ended up with this tree:
Now we're calculating (2 + 3) * 2
. It all comes down to which way the parser generates the tree
If you actually want to parse expressions, odds are you don't want to write the parser by hand. There are parser generators that take a configuration (called a grammar) similar to the one I used above, and generate the actual parser code. Parser generators will let you specify which rule should take priority, so for example:
Expr := Sum | Product | INTEGER;
Sum := Expr PLUS Expr; [2]
Product := Expr STAR Expr; [1]
I labeled the Product rule as priority 1, and Sum as priority 2, so given the choice the generated parser will favor Product. You can also design the grammar itself such that the priority is built-in (this is the more common approach). For example:
Expr := Sum | INTEGER;
Sum := Expr PLUS Product;
Product := Term STAR INTEGER;
This forces the Products to be under the Sums in the AST. Naturally this grammar is very limited (for example, it wouldn't match 2 * 3 + 2), but a comprehensive grammar can be written that still embeds an order of operations automatically
Upvotes: 11
Reputation: 45552
You would need to write a parser for your fairly simple programming language. If you want to do this in Python, start by reading Ned Batchelder's blog post Python Parsing Tools.
Upvotes: 4