Valentin Montmirail
Valentin Montmirail

Reputation: 2640

Change the parsing language

I'm using a modal-SAT solver. This solver is unfortunately using Flex and Bison, both languages that I don't master...

I wanted to change one syntax to another, but I've got some issue to do it, even after tutorials about Flex-Lexer and Bison.

So here is the problem :

I want to be able to parse such modal logic formulas :

a Logical problem in Modal logics as example

In the previous notation, such formulas were written like this :

(NOT (IMP (AND (ALL R0 (IMP C0 C1)) (ALL R0 C0)) (ALL R0 C1)))

And here are the Flex/Bisons file used to parse them :

alc.y

%{

#include "fnode.h"

#define YYMAXDEPTH 1000000

fnode_t *formula_as_tree;

%} 

%union {
   int     l;
   int     i;
   fnode_t *f;  
}

/* Tokens and types */

%token LP RP
%token ALL SOME
%token AND IMP OR IFF NOT
%token TOP BOT
%token RULE CONC
%token <l> NUM 

%type <f> formula 
%type <f> boolean_expression rule_expression atomic_expression
%type <f> other
%type <i> uboolop bboolop nboolop ruleop
%type <l> rule 

%% /* Grammar rules */

input: formula {formula_as_tree = $1;}       
;

formula: boolean_expression {$$ = $1;}
| rule_expression {$$ = $1;}
| atomic_expression {$$ = $1;}
;

boolean_expression: LP uboolop formula RP 
{$$ = Make_formula_nary($2,empty_code,$3);}

| LP bboolop formula formula RP 
{$$ = Make_formula_nary($2,empty_code, Make_operand_nary($3,$4));}

| LP nboolop formula other RP 
{$$ = Make_formula_nary($2,empty_code,Make_operand_nary($3,$4));}
;

rule_expression: LP ruleop rule formula RP {$$ = Make_formula_nary($2,$3,$4);}
;

atomic_expression: CONC NUM {$$ = Make_formula_nary(atom_code,$2,Make_empty());}
| TOP {$$ = Make_formula_nary(top_code,empty_code,Make_empty());}
| BOT {$$ = Make_formula_nary(bot_code,empty_code,Make_empty());}
;

other: formula other {$$ = Make_operand_nary($1,$2);}
| {$$ = Make_empty();}
;

uboolop: NOT {$$ = not_code;}  
;

bboolop: IFF {$$ = iff_code;} 
|        IMP {$$ = imp_code;}
;

nboolop: AND {$$ = and_code;} 
|        OR  {$$ = or_code;} 
;

ruleop: SOME {$$ = dia_code;}
| ALL {$$ = box_code;}       

rule: RULE NUM {$$ = $2;}
;

%% /* End of grammar rules */

int yyerror(char *s)  
{
  printf("%s\n", s);
  exit(0);
}

alc.lex

%{
#include <stdio.h>

#include "fnode.h"
#include "y.tab.h"

int number;
%}

%%

[ \n\t] ;    
"("      return LP;
")"      return RP; 
"ALL"    return ALL;         
"SOME"   return SOME;
"AND"    return AND; 
"IMP"    return IMP;
"OR"     return OR;
"IFF"    return IFF;
"NOT"    return NOT;
"TOP"    return TOP;
"BOTTOM" return BOT;
"R"      return RULE;
"C"      return CONC;

0|[1-9][0-9]* {   
  sscanf(yytext,"%d",&number);
  yylval.l=number;
  return NUM;
}

.  { 
  /* Error function */
  fprintf(stderr,"Illegal character\n");
  return -1;
}
%%

Now, let's write our example but in the new syntax that I want to use :

begin
(([r0](~pO | p1) & [r0]p0) | [r0]p1) 
end

Major problems for me that are blocking me to parse this new syntax correctly is :

Here is the small list of what are the new symbol, even if I don't know how to parse them :

"("      return LP;
")"      return RP; 
"[]"    return ALL;         
"<>"   return SOME;
"&"    return AND; 
"IMP"    return IMP;
"|"     return OR;
"IFF"    return IFF;
"~"    return NOT;
"true"    return TOP;
"false" return BOT;
"r"      return RULE;
"p"      return CONC;

I assume that only theses 2 files will change, Because it should still be able to read the previous syntaxe, by compiling the source code with other .y and .lex

But I'm asking your help to know exactly how to write it down :/

Thanks in advance !

Upvotes: 2

Views: 102

Answers (2)

Valentin Montmirail
Valentin Montmirail

Reputation: 2640

For someone who would have the exact same problem (I assume that this problem is quite rare :) )

With the good vocabulary, it's much easier to google the problem and find a solution.

The first notation

(NOT (IMP (AND (ALL R0 (IMP C0 C1)) (ALL R0 C0)) (ALL R0 C1)))

is the ALC format.

The other notation

begin (([r0](~pO | p1) & [r0]p0) | [r0]p1) end

is the InToHyLo format.

And there is a tool called the formula translation tool ("ftt") developed and bundled with Spartacus (http://www.ps.uni-saarland.de/spartacus/). It can translate between all the formats of provers.

Using this tool is a little hack who avoid dealing with the Flex/Bison languages.

One just needs to translate one problem to another, problems will be equivalent and it's very fast to translate.

Upvotes: 0

Axel Kemper
Axel Kemper

Reputation: 11322

Tommi Junttila's BC Package implements a language for Boolean expressions and circuits using Bison and Flex.

To study the source files won't fully replace going through a proper Bison/Flex tutorial, but it certainly should give you a good start.

Upvotes: 1

Related Questions