CODEWITHSUNDEEP

parsingdata-structurescompiler-constructionpostfix-notationinfix-notation

alfabeta

Reputation: 21

Why does a compiler have trouble parsing infix expressions compared to postfix expressions?

I was reading compiler design book and it said "compilers have problem solving infix expression because it has trouble with determining the priority of operands and you should always convert it to postfix then parse the expression" .

Why does a compiler have trouble parsing infix expressions compared to postfix expressions?

Upvotes: 1

Views: 1708

Answers (4)

Ira Baxter

Reputation: 95324

Compilers don't have any trouble parsing expressions in prefix, infix, or postfix order. Syntax is easy for compilers to handle.

You don't see a lot of compilers using prefix or postfix notation, though. That's because people aren't used to that. Pretty much only the Forth guys got away with postfix, and their compiler is almost trivial, which made it ideal for the very small machines in which it ran. Forth programmers learned to love postfix and got along just fine with a bit of experience.

[I don't know who told "you should always convert it to postfix then parse the expression" but that's nonsense].

Upvotes: 3

Stefan Haustein

Reputation: 18793

Infix parsing is quite straightforward with a recursive descent parser, as the short Java example below illustrates. Something similar can easily used for expression tree generation.

There are many things that may make sense to defer to different phases, but I have never seen a case where this reordering makes much sense as a transformation on the raw input before building the parse tree.

Maybe the book is just outdated? Which book is this quote from?

public class InfixProcessor {
  static final String[] OPERATORS = {"-+", "/*"};
  static StreamTokenizer tokenizer;

  static double process(String s) throws IOException {
    tokenizer = new StreamTokenizer(new StringReader(s));
    tokenizer.ordinaryChar('-');
    tokenizer.ordinaryChar('/');
    tokenizer.nextToken();
    return processInfix(0);
  }

  static double processInfix(int precedence) throws IOException {
    if (precedence >= OPERATORS.length) {
      return processPrimary();
    }
    double result = processInfix(precedence + 1);
    while (OPERATORS[precedence].indexOf((char) tokenizer.ttype) != -1) {
      int op = tokenizer.ttype;
      tokenizer.nextToken();
      double right = processInfix(precedence + 1);
      switch (op) {
        case '+': result += right; break;
        case '-': result -= right; break;
        case '*': result *= right; break;
        case '/': result /= right; break;
        default: throw new RuntimeException();
      }
    }
    return result;
  }

  static double processPrimary() throws IOException {
    if (tokenizer.ttype != StreamTokenizer.TT_NUMBER) {
      throw new RuntimeException("Number expected");
    }
    double result = tokenizer.nval;
    tokenizer.nextToken();
    return result;
  }

  public static void main(String[] args) throws IOException {
    BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
    while (true) {
      System.out.print("Expression> ");
      String expr = reader.readLine();
      if (expr == null || expr.isEmpty()) break;
      System.out.println("Result: " + process(expr));
    }
  }
}

Upvotes: 0

Reputation: 1155

You might consider it as a problem for the compiler to figure out where parentheses would go.

Consider two operators, * and +

if you have an infix notation, you can have statements like

X = A * B + C

which with ignorance of order of operations, a compiler might interpret as

X = ( A * B ) + C

or

X = A * ( B + C )

if written in post-fix notation there is no ambivilence. It is like the old HP calculators. There is a stack, an the operator pops two operands off the stack, and pushes the result back

So the first formula would look more like (ignoring the assignment to X, technically another operator)

A B * C +

while the second was

A B C + *

That said, your statement confused me a bit, because I thought that going from in-fix to post-fix was one intent of making a compiler, rather than a simple operation a compiler does

Upvotes: -1

Reputation: 1404

Consider a + b * c / d e. According to conventional precedence rules, this should be parsed as a + ((b * c) / d). Simply reading from right to left would yield ((a + b) * c)/d, which is not what you want. The compiler needs to know the precendence (and associativity) of every operator and take them into account.

On the other hand, postfix expressions have explicit precedence. For example, the first expression above is equivalent to b c * d / a +.

Not sure if I know what the author is getting at with "should always convert it to postfix then parse the expression", but that is the gist. (seems to me that converting to postfix requires parsing already).

Upvotes: 0

Related Questions