I’m currently trying to build a parser for the language Oberon using ANTLR and Eclipse.
This is what I have got so far:
grammar oberon;
options
{
language = Java;
//backtrack = true;
output = AST;
}
@parser::header {package dhbw.Oberon;}
@lexer::header {package dhbw.Oberon; }
T_ARRAY : 'ARRAY' ;
T_BEGIN : 'BEGIN';
T_CASE : 'CASE' ;
T_CONST : 'CONST' ;
T_DO : 'DO' ;
T_ELSE : 'ELSE' ;
T_ELSIF : 'ELSIF' ;
T_END : 'END' ;
T_EXIT : 'EXIT' ;
T_IF : 'IF' ;
T_IMPORT : 'IMPORT' ;
T_LOOP : 'LOOP' ;
T_MODULE : 'MODULE' ;
T_NIL : 'NIL' ;
T_OF : 'OF' ;
T_POINTER : 'POINTER' ;
T_PROCEDURE : 'PROCEDURE' ;
T_RECORD : 'RECORD' ;
T_REPEAT : 'REPEAT' ;
T_RETURN : 'RETURN';
T_THEN : 'THEN' ;
T_TO : 'TO' ;
T_TYPE : 'TYPE' ;
T_UNTIL : 'UNTIL' ;
T_VAR : 'VAR' ;
T_WHILE : 'WHILE' ;
T_WITH : 'WITH' ;
module : T_MODULE ID SEMI importlist? declarationsequence?
(T_BEGIN statementsequence)? T_END ID PERIOD ;
importlist : T_IMPORT importitem (COMMA importitem)* SEMI ;
importitem : ID (ASSIGN ID)? ;
declarationsequence :
( T_CONST (constantdeclaration SEMI)*
| T_TYPE (typedeclaration SEMI)*
| T_VAR (variabledeclaration SEMI)*)
(proceduredeclaration SEMI | forwarddeclaration SEMI)*
;
constantdeclaration: identifierdef EQUAL expression ;
identifierdef: ID MULT? ;
expression: simpleexpression (relation simpleexpression)? ;
simpleexpression : (PLUS|MINUS)? term (addoperator term)* ;
term: factor (muloperator factor)* ;
factor: number
| stringliteral
| T_NIL
| set
| designator '(' explist? ')'
;
number: INT | HEX ; // TODO add real
stringliteral : '"' ( ~('\\'|'"') )* '"' ;
set: '{' elementlist? '}' ;
elementlist: element (COMMA element)* ;
element: expression (RANGESEP expression)? ;
designator: qualidentifier
('.' ID
| '[' explist ']'
| '(' qualidentifier ')'
| UPCHAR )+
;
explist: expression (COMMA expression)* ;
actualparameters: '(' explist? ')' ;
muloperator: MULT | DIV | MOD | ET ;
addoperator: PLUS | MINUS | OR ;
relation: EQUAL ; // TODO
typedeclaration: ID EQUAL type ;
type: qualidentifier
| arraytype
| recordtype
| pointertype
| proceduretype
;
qualidentifier: (ID '.')* ID ;
arraytype: T_ARRAY expression (',' expression) T_OF type;
recordtype: T_RECORD ('(' qualidentifier ')')? fieldlistsequence T_END ;
fieldlistsequence: fieldlist (SEMI fieldlist) ;
fieldlist: (identifierlist COLON type)? ;
identifierlist: identifierdef (COMMA identifierdef)* ;
pointertype: T_POINTER T_TO type ;
proceduretype: T_PROCEDURE formalparameters? ;
variabledeclaration: identifierlist COLON type ;
proceduredeclaration: procedureheading SEMI procedurebody ID ;
procedureheading: T_PROCEDURE MULT? identifierdef formalparameters? ;
formalparameters: '(' params? ')' (COLON qualidentifier)? ;
params: fpsection (SEMI fpsection)* ;
fpsection: T_VAR? idlist COLON formaltype ;
idlist: ID (COMMA ID)* ;
formaltype: (T_ARRAY T_OF)* (qualidentifier | proceduretype);
procedurebody: declarationsequence (T_BEGIN statementsequence)? T_END ;
forwarddeclaration: T_PROCEDURE UPCHAR? ID MULT? formalparameters? ;
statementsequence: statement (SEMI statement)* ;
statement : assignment
| procedurecall
| ifstatement
| casestatement
| whilestatement
| repeatstatement
| loopstatement
| withstatement
| T_EXIT
| T_RETURN expression?
;
assignment: designator ASSIGN expression ;
procedurecall: designator actualparameters? ;
ifstatement: T_IF expression T_THEN statementsequence
(T_ELSIF expression T_THEN statementsequence)*
(T_ELSE statementsequence)? T_END ;
casestatement: T_CASE expression T_OF caseitem ('|' caseitem)*
(T_ELSE statementsequence)? T_END ;
caseitem: caselabellist COLON statementsequence ;
caselabellist: caselabels (COMMA caselabels)* ;
caselabels: expression (RANGESEP expression)? ;
whilestatement: T_WHILE expression T_DO statementsequence T_END ;
repeatstatement: T_REPEAT statementsequence T_UNTIL expression ;
loopstatement: T_LOOP statementsequence T_END ;
withstatement: T_WITH qualidentifier COLON qualidentifier T_DO statementsequence T_END ;
ID : ('a'..'z'|'A'..'Z')('a'..'z'|'A'..'Z'|'_'|'0'..'9')* ;
fragment DIGIT : '0'..'9' ;
INT : ('-')?DIGIT+ ;
fragment HEXDIGIT : '0'..'9'|'A'..'F' ;
HEX : HEXDIGIT+ 'H' ;
ASSIGN : ':=' ;
COLON : ':' ;
COMMA : ',' ;
DIV : '/' ;
EQUAL : '=' ;
ET : '&' ;
MINUS : '-' ;
MOD : '%' ;
MULT : '*' ;
OR : '|' ;
PERIOD : '.' ;
PLUS : '+' ;
RANGESEP : '..' ;
SEMI : ';' ;
UPCHAR : '^' ;
WS : ( ' ' | '\t' | '\r' | '\n'){skip();};
My problem is that when I check the grammar, I get the following error and just can’t find an appropriate way to fix it:
rule statement has non-LL(*) decision
due to recursive rule invocations reachable from alts 1,2.
Resolve by left-factoring or using syntactic predicates
or using backtrack=true option.
|---> statement : assignment
I have the same problem with declarationsequence and simpleexpression.
When I use options { … backtrack = true; … } the grammar at least compiles, but the parser no longer works correctly when I run it on a test file. I can’t find a way to resolve these non-LL(*) decisions on my own (or maybe I’ve just been looking at this for far too long). Any ideas how I could change the rules where the errors occur to make it work?
EDIT
I could fix one of the three mistakes. statement works now. The problem was that assignment and procedurecall both started with designator.
statement : procedureassignmentcall
| ifstatement
| casestatement
| whilestatement
| repeatstatement
| loopstatement
| withstatement
| T_EXIT
| T_RETURN expression?
;
procedureassignmentcall : (designator ASSIGN)=> assignment | procedurecall;
assignment: designator ASSIGN expression ;
procedurecall: designator actualparameters? ;
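For comparison, the same conflict could probably also be removed by plain left-factoring instead of a syntactic predicate: match the shared designator once and only then decide between assignment and call. This is an untested sketch. The rule name designatorstatement is made up for illustration, and it would replace procedureassignmentcall, assignment and procedurecall in the statement rule. Note that the '(' qualidentifier ')' alternative inside designator still overlaps with actualparameters, so ANTLR may keep warning about that part.

```antlr
// Sketch only: hypothetical rule name, not checked against ANTLR.
// The common prefix (designator) is parsed once; the next token decides the rest.
designatorstatement
    : designator
      ( ASSIGN expression      // assignment
      | actualparameters       // procedure call with an argument list
      )?                       // nothing follows: call without arguments
    ;
```

With this rule, the first alternative of statement would be designatorstatement, and the (designator ASSIGN)=> predicate would no longer be needed.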