Serban
Serban

Reputation: 11

Java Antlr4 first order logic grammar

I want to parse strings in first order logic and turn them into a specific class structure. For instance, I want to parse a formula such as

∀x∃y∃z((R(x,y) ∨ Px)→(Qx→(Px∧Zx))) 

and turn it into a Universal class which has a Variable field and a Formula quantifiedFormula field which stands for the rest of the formula. I have troubles with the grammar, though. When I parse that formula with the antlr generated code I get

line 1:11 extraneous input '(' expecting {'\u2200', '\u2203', '\u00ac'}

'\u2200' is ∀, \u2203 is ∃ and \u00ac is ¬, the negation sign.

This is my grammar file. I put it together following the FOL.g file found on the antlr3 site. I am using antl4, however.

grammar FOL;

options{
    language=Java;
    output=AST;
    ASTLabelType = CommonTree;
    backtrack=true;
}

tokens{
    PREDICATE,
    FUNCTION
}

/*------------------------------------------------------------------
* PARSER RULES
*------------------------------------------------------------------*/

condition: formula EOF ;

formula 
:     (forall | exists)* bidir ;
forall  :   FORALL VARIABLE ;
exists  :   EXISTS VARIABLE ;

bidir   :   implication (BIDIR implication)*;

implication
:   disjunction (IMPL disjunction)*;

disjunction
:   conjunction (OR conjunction)* ;

conjunction
:   negation (AND negation)* ;

negation 
:   NOT (predicate | LPAREN* formula RPAREN*) ;

predicate 
:   PREPOSITION predicateTuple (PREDICATE PREPOSITION predicateTuple)
|   PREPOSITION ;

predicateTuple
:   LPAREN term (',' term)* RPAREN ;

term    :   function | VARIABLE ;

 function:  CONSTANT functionTuple (FUNCTION CONSTANT functionTuple)
|   CONSTANT;

functionTuple
    :   LPAREN (CONSTANT | VARIABLE) (',' (CONSTANT | VARIABLE) )* RPAREN;

/*------------------------------------------------------------------
* LEXER RULES
*------------------------------------------------------------------*/
LPAREN: '(';
RPAREN: ')';
FORALL: '\u2200';
EXISTS: '\u2203';
NOT:'\u00ac';
IMPL: '\u2192';
BIDIR: '\u2194';
OR: '\u2228';
AND: '\u2227';
EQ: '=';


VARIABLE:  (('q'..'z') ) CHARACTER* ;

CONSTANT: (('a'..'p') ) CHARACTER* ;

PREPOSITION: ('A'..'Z') CHARACTER* ;

fragment CHARACTER: ('a'..'z' | 'A'..'Z' | '_') ;

WS : (' ' | '\t' | '\r' | '\n')+ -> skip  ;

Upvotes: 1

Views: 702

Answers (1)

rici
rici

Reputation: 241721

That seems unsurprising.

According to your grammar, a formula is some number of exists and forall clauses followed by a bidir. If you trace through the productions for bidir, it becomes clear that it must start with a negation and that, in turn, must start with NOT. So while you scan the formula, you must see clauses headed by one of the three tokens EXISTS, FORALL or NOT.

Your negation needs to include the possibility that it is not a negation. You could, for example, make NOT optional.

Upvotes: 4

Related Questions