Tomás Juárez
Tomás Juárez

Reputation: 1514

Parsing predicate logic in bison c++

I am developing a software to simplify a formula in predicate logic notation, applying some logical laws to output the formula in a conjunctive normal form. I will try to explain everything as clearly as possible. I know that this is not a mathematic site, but I have to explain a bit some concepts to get concret answers.


What will the software do?

The software will have an entry in which the user can insert a formula and the program will process until reach its conjunctive normal form. So I have to do a parser that can handle the input and put each token in some structure an then thansform it as needed. I will use the following tools:

How to get the conjunctive normal form of a formula?

So, basically is the same formula but separated by the && token.

Construction status

The first problem that I had when I was trying to manipulate the parts of the input formula was handle the recursion my grammar rules.

Suppose the following rules:

expr:
         SYMBOL                                  { cout << " 0 "; }
    |    LEFT_PARENTHESES expr RIGHT_PARENTESES  { cout << " 1 "; }
    |    expr AND expr                           { cout << " 2 "; }
;

and the input (a && b) then the program prints: 0 - 0 - 1 - 2, this is: from the last rule executed to the first. I've realize that using a queue and inserting each token with the push_back() method, then I can get all my tokens ordered by these rules. But I want to operate with these expressions, not only verify her syntax and print the tokens, so I had to create a more complex structure (not so complex), a binary tree to simulate an abstract syntax tree and then transform it. Suppose now the formula a && b || c, represented in my tree it will be:

      &&
     /  \
    a    ||
        /  \
       b    c

this is, the parent is always an operator, and the childs are the operands or some expressions. If I have an expression like a && (b || c), then I have to include the parentheses:

      &&
     /  \
    a    (
          \
          ||
         /  \
        b    c

in that way, all the tokens in the right side of ( will be enclosed in one group. So, I can do it with simple formulas(binary or multiple operations with the not, and, || and () operators; I can resolve nested formulas too and apply De Morgan's laws) and this works very good :D.

So, whats the problem?

Ok, as I say above, it only works with simple formulas. But what happen if the input have the operators -> (implies) or <-> (if and only if)? (...) the program does't work. All I have to do is apply the definitions of that logic operators:

So, the first thing I have to do is to transform the -> and <-> operations using the rules (1) and (2). Once I have all the expressions with the elementals operators (&&, ||, ! and ()), I have to distribute the ! operators using De Morgan's laws and distibute the || to returns its conjunctive normal form. It sounds simple to do, but it is not (at least not for me).

Do you remember when I mentioned that the rules are not executed in the order of arrival of tokens?. In this case, I can apply the -> operator: let a->b adding a ! in the left expression (a) and inserting || between these two expressions (a and b). I can do this adding a parent in my tree (!), a right child a, a || parent and another child b:

      ||
     /  \
    !    b
     \   
      a  

So, in this case works, but I can't do the same with the <-> operator!

So, my question is: how I can solve these problems?, do you recommend using another structure to store tokens?, do you know any techniques to do this?


The code

parser.y

%{
    #include <iostream>
    #include "parsingtree.h"

    using namespace std;

    #define YYSTYPE string

    int yylex(void);
    void yyerror(char *);

    ParsingTree tree;
%}

%token  LEFT_PAR
        RIGHT_PAR
        SYMBOL
        END
        NOT
        DIST

%left   AND
        OR
        THEN
        IFF

%start input

%%

input:
        /* empty */
    |   input expr
;

expr:
        SYMBOL {
            $$ = $1;
            cout << "ACA ENTRO A LA REGLA 0" << endl;
            tree.pushSymbol($1);
        }
    |   LEFT_PAR expr RIGHT_PAR {
            $$ = $2;
            tree.pushLogicOperator("()");
        }
    |   NOT expr {
            $$ = $2;
            tree.pushLogicOperator("!");
        }
    |   DIST expr RIGHT_PAR {
            $$ = $2;
            tree.pushLogicOperator("!");
        }
    |   expr AND expr {
            tree.pushLogicOperator("&&");
        }
    |   expr OR expr {
            tree.pushLogicOperator("||");
        }
    |   expr THEN expr {
            tree.pushLogicOperator("||");
            tree.pushLogicOperator("!");
        }
;

            tree.pushLogicOperator("||");
            tree.pushLogicOperator("!");

        }
    |   SYMBOL THEN expr {
            tree.pushSymbol($1);
            tree.pushLogicOperator("||");
            tree.pushLogicOperator("!");
        }
%%

void yyerror(char *s) {

}

scanner.l

%option noyywrap
%{
    #include <iostream>
    #include "parser.tab.c"
    using namespace std;
%}

%%

[a-zA-Z0-9<>=]+  {
    yylval = strdup(yytext);
    return SYMBOL;
}

"&&" {
    return AND;
}

"||" {
    return OR;
}

"!" {
    return NOT;
}

"!(" {
    return DIST;
}

[ \0\0] {
    return END;
}

"("     {
    return LEFT_PAR;
}

")"     {
    return RIGHT_PAR;
}

"->"    {
    return THEN;
}

"<->"   {
    return IFF;
}

%%

main.cpp

#include "parsingtree.h"
#include "lex.yy.c"

typedef yy_buffer_state *YY_BUFFER_STATE;
extern int yyparse();
extern YY_BUFFER_STATE yy_scan_string(char *, size_t);

int main(int argc, char *argv[]) {
    yy_scan_string("(a&&(d->g))->(b&&c)\0\0");
    yyparse();

    tree.printTree();
}

parsingtree.h

#ifndef PARSINGTREE_H
#define PARSINGTREE_H
#include <QString>
#include <QList>
#include <QDebug>
#include <iostream>

using namespace std;

class ParsingTree
{
public:
    ParsingTree();
    void pushSymbol(string data);
    void pushLogicOperator(string data);
    void printTree();

private:
    struct treeNode
    {
        string data;
        treeNode* leftNode;
        treeNode* rightNode;
    };
    QList<treeNode*> auxList; //guarda el arbol en formación
    treeNode* tree; //pedir el ultimo elemento de la lista, ese es el arbol

    void printTree(treeNode* rTree);
    string latest(treeNode* rTree);
};

#endif // PARSINGTREE_H

parsingtree.cpp

#include "parsingtree.h"

ParsingTree::ParsingTree()
{
    tree= NULL;
}

void ParsingTree::pushSymbol(string data)
{
    treeNode* node = new treeNode;
    node->data = data;
    node->leftNode = NULL;
    node->rightNode= NULL;
    auxList.push_back(node);
}

void ParsingTree::pushLogicOperator(string data)
{
    //binarios
    if ((data == "||") || (data == "&&"))
    {
         treeNode* node = new treeNode;
         node->data = data;
         node->rightNode=auxList.last();
         auxList.removeLast();
         node->leftNode=auxList.last();
         auxList.removeLast();
         auxList.push_back(node);
    }
    //unarios
    if ((data == "!") || (data == "()"))
    {
        treeNode* node = new treeNode;
        node->data = data;
        node->rightNode=auxList.last();
        auxList.removeLast();
        node->leftNode= NULL;
        auxList.push_back(node);
    }
}

void ParsingTree::printTree()
{
    tree = auxList.last();
    printTree(tree);
}

void ParsingTree::printTree(ParsingTree::treeNode* rTree)
{
    if (rTree != NULL)
    {

        printTree(rTree->leftNode);
        if (rTree->data == "()")
        {
            cout << "$(";
            printTree(rTree->rightNode);
            cout << ")$";
        }
        else
        {
            cout << rTree->data + " ";
            printTree(rTree->rightNode);
        }
    }
}

Thanks! :D

PS: if I wrote something wrong or there is something that you can not understand, please ask me so I can express myself better; My English is very poor (I'm from Argentina) and I find this problem very difficult to explain, so I hope you've understood.

Upvotes: 0

Views: 1448

Answers (1)

ShellFish
ShellFish

Reputation: 4551

Use this pseudocode:

P and Q are atomic OR expr between last ')' and corresponding '('

while '<=>' occurs
    P <=> Q |= (P => Q) && (Q => P)

while '=>' occurs
    P => Q |= !P || Q

while '!(P || Q)' || '!(P && Q)' occurs
    apply De Morgan

while 'P || (Q && R)' || '(Q & R) || P' occurs
    |= (P || Q) && (P || R)

apply these rules after each wile:
while 'P || (Q || R)' || '(P || Q) || R' occurs
    |= P || Q || R
while 'P && (Q && R)' || '(P && Q) && R' occurs
    |= P && Q && R

You can implement using regex replacement or parsing the string using c++. Be sure to parse from left to right or right to left in a consistent manner. This will determine whether the logic is left or right associative.

Please let me know if you find an ambiguity in this replacement scheme.

Upvotes: 1

Related Questions