Luther Baker

What is wrong with my ANTLR grammar for parsing a simplistic Java file?

ANTLR grammar:

grammar Java;

// Parser

compilationUnit: classDeclaration;

classDeclaration : 'class' CLASS_NAME classBlock
  ;

classBlock: OPEN_BLOCK method* CLOSE_BLOCK
  ;

method: methodReturnValue methodName methodArgs methodBlock
  ;

methodReturnValue: CLASS_NAME
  ;

methodName: METHOD_NAME
  ;

methodArgs: OPEN_PAREN CLOSE_PAREN
  ;

methodBlock: OPEN_BLOCK CLOSE_BLOCK
  ;

// Lexer

CLASS_NAME: ALPHA;
METHOD_NAME: ALPHA;

WS: [ \t\n] -> skip;

OPEN_BLOCK: '{';
CLOSE_BLOCK: '}';

OPEN_PAREN: '(';
CLOSE_PAREN: ')';

fragment ALPHA: [a-zA-Z][a-zA-Z0-9]*;

Pseudo-Java file:

class Test {

    void run() { }

}

Most of the input matches, except for METHOD_NAME, which the parser mistakenly associates with methodArgs:

line 3:6 mismatched input 'run' expecting METHOD_NAME

Answers (1)

BernardK

This is a token ambiguity problem. The same question has been asked several times in recent weeks. Follow the links, especially "disambiguate", in this answer.

As soon as you get a mismatched input error, add -tokens to grun to display the tokens. It helps you find the discrepancy between what you THINK the lexer will do and what it actually DOES. With your grammar:

CLASS_NAME: ALPHA;
METHOD_NAME: ALPHA;

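every input matched by ALPHA is ambiguous, because both rules match exactly the same text. When several lexer rules match the same input with the same length, ANTLR chooses the rule defined first, here CLASS_NAME, so METHOD_NAME is never produced.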

$ grun Question compilationUnit -tokens -diagnostics t.text 
[@0,0:4='class',<'class'>,1:0]
[@1,6:9='Test',<CLASS_NAME>,1:6]
[@2,11:11='{',<'{'>,1:11]
[@3,18:21='void',<CLASS_NAME>,3:4]
[@4,23:25='run',<CLASS_NAME>,3:9]
[@5,26:26='(',<'('>,3:12]
[@6,27:27=')',<')'>,3:13]
[@7,29:29='{',<'{'>,3:15]
[@8,31:31='}',<'}'>,3:17]
[@9,34:34='}',<'}'>,5:0]
[@10,36:35='<EOF>',<EOF>,6:0]
Question last update 0841
line 3:9 mismatched input 'run' expecting METHOD_NAME

because run has been interpreted as a CLASS_NAME.
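
The tie-break can be illustrated outside ANTLR. The following is only a rough sketch that imitates the rule with java.util.regex; it is not how ANTLR is implemented, and the class name LexerTieBreakDemo and the method winningRule are hypothetical names made up for this illustration. It also ignores the implicit 'class' literal token. It assumes the two rules from the question, in their declaration order:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: imitates "longest match, then first rule wins".
final class LexerTieBreakDemo {

    // Lexer rules in declaration order; LinkedHashMap preserves that order.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        // Both rules expand to the same ALPHA fragment from the question.
        RULES.put("CLASS_NAME", Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
        RULES.put("METHOD_NAME", Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
    }

    // Returns the name of the rule that wins at the start of the input,
    // or null if no rule matches.
    static String winningRule(String input) {
        String best = null;
        int bestLength = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // lookingAt() anchors the match at the start of the input.
            // The strict '>' means a later rule only wins with a longer
            // match, so on a tie the earlier rule is kept.
            if (m.lookingAt() && m.end() > bestLength) {
                best = rule.getKey();
                bestLength = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"Test", "void", "run"}) {
            System.out.println(word + " -> " + winningRule(word));
        }
    }
}
```

Every word is reported as CLASS_NAME. METHOD_NAME can never win, because it never matches more text than the rule declared before it.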

I would write the grammar like this:

grammar Question;

// Parser

compilationUnit
@init {System.out.println("Question last update 0919");}
    : classDeclaration;

classDeclaration : 'class' ID classBlock
  ;

classBlock: OPEN_BLOCK method* CLOSE_BLOCK
  ;

method: methodReturnValue=ID methodName=ID methodArgs methodBlock
        {System.out.println("Method found : " + $methodName.text + 
                            " which returns a " + $methodReturnValue.text);}
  ;

methodArgs: OPEN_PAREN CLOSE_PAREN
  ;

methodBlock: OPEN_BLOCK CLOSE_BLOCK
  ;

// Lexer

ID : ALPHA ( ALPHA | DIGIT | '_' )* ;

WS: [ \t\n] -> skip;

OPEN_BLOCK: '{';
CLOSE_BLOCK: '}';

OPEN_PAREN: '(';
CLOSE_PAREN: ')';

fragment ALPHA : [a-zA-Z] ;
fragment DIGIT : [0-9] ;

Execution:

$ grun Question compilationUnit -tokens -diagnostics t.text 
[@0,0:4='class',<'class'>,1:0]
[@1,6:9='Test',<ID>,1:6]
[@2,11:11='{',<'{'>,1:11]
[@3,18:21='void',<ID>,3:4]
[@4,23:25='run',<ID>,3:9]
[@5,26:26='(',<'('>,3:12]
[@6,27:27=')',<')'>,3:13]
[@7,29:29='{',<'{'>,3:15]
[@8,31:31='}',<'}'>,3:17]
[@9,34:34='}',<'}'>,5:0]
[@10,36:35='<EOF>',<EOF>,6:0]
Question last update 0919
Method found : run which returns a void

Running $ grun Question compilationUnit -gui t.text displays the parse tree graphically.

The labels methodReturnValue and methodName are also available in a listener, as fields of ctx, the rule context (ctx.methodReturnValue and ctx.methodName).
