Reputation: 13912
Given the grammar:
grammar T;
options
{
k=4;
language=CSharp3;
TokenLabelType=CommonToken;
output=AST;
ASTLabelType=CommonTree;
}
tokens
{
LPAREN = '(';
RPAREN = ')';
LBRACK = '{';
RBRACK = '}';
}
fragment
ID : ('a'..'z'|'A'..'Z'|'_')('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
WS : (' ' | '\t' | '\n' |'\r' )+ { $channel = Hidden; } ;
public program: CLASSDEF+ EOF! ;
CLASSDEF: 'class' ID LBRACK
RBRACK ;
This produces a lexer and a parser, which I use as follows:
using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;
namespace compiler
{
    internal class Program2
    {
        public static void Main(string[] arg)
        {
            ANTLRStringStream Input = new ANTLRStringStream(@"class foo
{
}");
            TLexer lex = new TLexer(Input);
            Console.WriteLine("errors:" + lex.NumberOfSyntaxErrors);
            CommonTokenStream tokens = new CommonTokenStream(lex);
            TParser parser = new TParser(tokens);
            var parsed = parser.program();
            Console.WriteLine("errors: " + parser.NumberOfSyntaxErrors);
            CommonTree tree = parsed.Tree;
            Console.WriteLine("type:" + tree.Type);
            Console.WriteLine("text:" + tree.Text);
            Console.WriteLine("children:" + tree.ChildCount);
            Console.WriteLine(tree.ToString());
            Console.WriteLine(tree.ToStringTree());
            Console.ReadKey();
        }
    }
}
When I run this code I get 0 lexer errors and 1 parser error.
Result:
errors:0
errors: 1
type:0
text:{
}
children:0
<error: {
}>
<error: {
}>
Questions!
I thought ANTLR was supposed to give intelligent error messages, yet I can't work out what's wrong.
Am I missing code to improve on the error messages?
Upvotes: 1
Views: 1005
Reputation: 170308
You made CLASSDEF a lexer rule (in other words: a single token), which is incorrect. When the lexer stumbles upon input like "class X", it cannot create a CLASSDEF token because there's a space between "class" and "X" (and no, the WS token will not help you with this, since CLASSDEF is a lexer rule).
In other words: make CLASSDEF a parser rule instead (and remove fragment from ID!):
grammar T;
options
{
language=CSharp3;
output=AST;
}
tokens
{
CLASS = 'class';
LPAREN = '(';
RPAREN = ')';
LBRACK = '{';
RBRACK = '}';
}
public program
: class_def+ EOF!
;
class_def
: CLASS ID LBRACK RBRACK
;
ID
: ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
WS
: (' ' | '\t' | '\n' |'\r' )+ { $channel = Hidden; }
;
Now input like "class foo { }" is parsed as program -> class_def -> CLASS ID LBRACK RBRACK, followed by EOF.
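The difference between the two grammars can be illustrated with a toy lexer. This is only a sketch and not ANTLR: ToyLexer, SingleTokenRules and SplitRules are hypothetical names invented for this example, the rules are plain regular expressions, and the error handling (emit ERROR, skip one character) is a simplification that does not reproduce ANTLR's actual recovery. It only shows that a rule covering the whole class definition matches contiguous characters, while separate CLASS and ID tokens let whitespace be skipped between them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Toy model only: this is NOT ANTLR and none of these names come from its API.
public static class ToyLexer
{
    // Mirrors the question's grammar: the whole class definition is one token,
    // so no whitespace is allowed anywhere inside it.
    public static readonly (string Name, string Pattern)[] SingleTokenRules =
    {
        ("CLASSDEF", @"class[A-Za-z_][A-Za-z0-9_]*\{\}"),
        ("LBRACK", @"\{"),
        ("RBRACK", @"\}"),
        ("WS", @"[ \t\r\n]+"),
    };

    // Mirrors the answer's grammar: CLASS and ID are separate tokens.
    public static readonly (string Name, string Pattern)[] SplitRules =
    {
        ("CLASS", "class"),
        ("ID", @"[A-Za-z_][A-Za-z0-9_]*"),
        ("LBRACK", @"\{"),
        ("RBRACK", @"\}"),
        ("WS", @"[ \t\r\n]+"),
    };

    public static List<string> Tokenize(string input, (string Name, string Pattern)[] rules)
    {
        var tokens = new List<string>();
        int pos = 0;
        while (pos < input.Length)
        {
            string bestName = null;
            int bestLength = 0;
            string rest = input.Substring(pos);
            foreach (var rule in rules)
            {
                // \A anchors the rule at the current position.
                var match = Regex.Match(rest, @"\A(?:" + rule.Pattern + ")");
                // Longest match wins; on a tie the earlier rule is kept.
                if (match.Success && match.Length > bestLength)
                {
                    bestName = rule.Name;
                    bestLength = match.Length;
                }
            }
            if (bestName == null)
            {
                // No rule matches here: record an error and skip one character.
                tokens.Add("ERROR");
                pos++;
                continue;
            }
            // Whitespace is consumed between tokens but never emitted.
            if (bestName != "WS") tokens.Add(bestName);
            pos += bestLength;
        }
        return tokens;
    }
}

public static class Demo
{
    public static void Main()
    {
        // No whitespace: the single-token rule matches.
        Console.WriteLine(string.Join(" ", ToyLexer.Tokenize("classfoo{}", ToyLexer.SingleTokenRules)));
        // With whitespace: the single-token rule never matches.
        Console.WriteLine(string.Join(" ", ToyLexer.Tokenize("class foo { }", ToyLexer.SingleTokenRules)));
        // Split rules: prints "CLASS ID LBRACK RBRACK".
        Console.WriteLine(string.Join(" ", ToyLexer.Tokenize("class foo { }", ToyLexer.SplitRules)));
    }
}
```

In this toy model the whitespace rule only applies between tokens, never inside one, which is the reason the hidden WS channel cannot rescue a CLASSDEF lexer rule.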
Upvotes: 3