Felipe
Felipe

Reputation: 697

how to validate infix expressions with parenthesis?

I'm struggle with a problem that asks to parse an infix expression to postfix. But firstly i need to check if the infix expression is correct. How to do that ?

In the input, are valid:

Table of operators

A possible solution that i have seen is to parse from infix to postfix firstly and check if the postfix expression is valid.

I've tried to use the following algorithm for parse infix-postfix

1) read every character in the expression from left to right and create a empty stack
2) if the current character is an operand put on the output
3) if the current character is an operator and stack is empty; push it to stack.
4) if the current character is an operator and stack is not empty; while the operator on top of stack has higher priority then the current operator put the top of stack on the output and pop it; in the end push on stack the current character;
5) if the current character is '(' push it to stack.
6) if the current character is ')' pop the stack and put to output; until find a '('.

7)At the end of the input; if stack is not empty; pop all elements and put it on the output.

But this algorithm does not work in cases like this : "(ab+c)/2(e+a)". Because it will construct a valid postfix expression. But this expression is not a valid infix expression.

Here is my code :

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.util.Stack;

public class Main {

    public static boolean isOperand(char c) {
        return ('0' <= c && c <= '9') || ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z');
    }

    public static int getThePrecedence(char c){
        if(c == '^')
            return 6;
        if(c == '*' || c == '/')
            return 5;
        if(c == '-' || c == '+')
            return 4;
        if(c == '>' || c == '<' || c == '=' || c == '#')
            return 3;
        if(c == '.')
            return 2;
        if(c == '|')
            return 1;
        return 0;
    }


    public static boolean checkParenthesis(String s){
        int stack = 0;
        for (int i = 0; i < s.length(); ++i) {
            char c = s.charAt(i);
            if(c == '(')
                ++stack;
            else if(c == ')'){
                --stack;
                if(stack < 0)
                    return false;
            }
        }
        return stack == 0;
    }

    public static String parseInfixToPostfix(String s) {
        StringBuilder out = new StringBuilder("");
        Stack<Character> stack = new Stack<>();

        for (int i = 0; i < s.length(); ++i) {
            char c = s.charAt(i);

            if (isOperand(c)) {
                out.append(c);
                continue;
            }

            if (c == '(') {
                stack.add(c);
                continue;
            }

            if (c == ')') {
                char ch = stack.pop();

                //create a temporary stringbuilder to check if the expression has an empty parenthesis ()

                StringBuilder temp = new StringBuilder("");
                while (ch != '(') {
                    temp.append(ch);
                    ch = stack.pop();
                }

                if(temp.length() == 0) //empty parenthesis
                    return ""; //will be invalidated by the postfix checker

                out.append(temp.toString());
                continue;
            }

            //here are only operators
            if(stack.empty()){
               stack.push(c);
            }

            else{
                while(!stack.empty() && ( getThePrecedence(stack.peek()) >= getThePrecedence(c) ))
                    out.append(stack.pop());
                stack.push(c);
            }
        }

        while(!stack.empty())
            out.append(stack.pop());

        return out.toString();
    }

    public static boolean postFixValidate(String s){
        if(s.equals(""))
            return false;

        int stack = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if(c == '(' || c == ')')
                continue;

            if(isOperand(c)){
                ++stack;
            }
            else {
                stack -= 2;
                if(stack < 0)
                    return false;
                ++stack;
            }
        }
        return stack == 1;
    }

    public static boolean lexicalAnalysis(String s){
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if(isOperand(c) == false && getThePrecedence(c) == 0 && c != '(' && c != ')')
                return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {

        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        PrintStream out = new PrintStream(System.out);

        while (true) {
            String s = in.readLine();

            if (lexicalAnalysis(s)) {
                if(checkParenthesis(s)){
                    String postfix = parseInfixToPostfix(s);
                    if(postFixValidate(postfix)){
                        System.out.println(postfix);
                    }
                    else{
                        out.println("Syntax Error!");
                    }
                }
                else{
                    out.println("Syntax Error!");
                }
            } else {
                out.println("Lexical Error!");
            }
        }
    }
}

Upvotes: 1

Views: 7679

Answers (1)

Patrick87
Patrick87

Reputation: 28292

The way you do this is to build a parse tree. Nodes will be operators and leaves will be terminal symbols (numbers or letters). If you get into a situation where you're trying to put children under a terminal symbol, or where some operands don't have two children, you're done. Consider "(ab+c)/2(e+a)". We are parsing infix, so we know the first thing we will read is the LHS of some operation. We see a left parenthesis, so we recurse. We know the first thing we will see is the LHS of some operation, and we see 'a'. We now expect to see an operator, but see 'b' instead, and this is an error since 'a' cannot be the left child of 'b' (since both are terminal symbols). This errors out.

Suppose the 'b' weren't there, and we see '+' instead. Then all's well, and we expect the RHS next. We see 'c', all good, then the right parenthesis cues us to end the recursion there. Our tree looks like:

    ?
   / \
 add   ?
 / \
a   b

Having seen the LHS, we now see '/'. All good. Our tree is now

    divide
   /      \
 add       ?
 / \
a   b

Next we see a 2, all good, but then we've seen all we need and expect to see nothing else. Seeing a right parenthesis cues us to understand that building the tree has failed.

Suppose the 2 had not been present. We see the left parenthesis and recurse. We see 'e' as the LHS, '+' as the operator, 'a' as the RHS and then the right parenthesis, ending the recursion. Our tree now looks like

    divide
   /      \
 add      add
 / \      / \
a   b    e   a

This is a valid tree, and we can get the postfix expression by doing a preorder traversal of the tree: we get '/+ab+ea'. The basic algorithm preorder traversal is:

preorder(node)
    print node.value
    preorder(node.left)
    preorder(node.right)

Note that I don't address order of operations in the answer so far. You can avoid the problem altogether by first fully parenthesizing your infix expression; this simplifies the rest. Basically, just start finding operators in whatever order you like, then ensure that there are parentheses around the (LHS op RHS) triple. Remove the parentheses from the outermost operation and then you're good to go.

Upvotes: 2

Related Questions