Tengyu Liu

Reputation: 1253

ANTLR4: Unexpected behavior that I can't understand

I'm very new to ANTLR4 and am trying to build my own language. So my grammar starts at

program: <EOF> | statement | functionDef | statement program | functionDef program;

and my statement is

statement: selectionStatement | compoundStatement | ...;

and

selectionStatement
:   If LeftParen expression RightParen compoundStatement (Else compoundStatement)?
|   Switch LeftParen expression RightParen compoundStatement
;

compoundStatement
: LeftBrace statement* RightBrace;

Now the problem is that when I test a piece of code against selectionStatement or statement, it passes, but when I test it against program, it is not recognized. Can anyone help me with this? Thank you very much.


Edit: the code I use to test is the following:

if (x == 2) {}

It passes the test against selectionStatement and statement but fails at program. It appears that program only accepts the if...else form:

if (x == 2) {} else {}

Edit 2: The error message I received was

<unknown>: Incorrect error: no viable alternative at input 'if(x==2){}'

Upvotes: 0

Views: 1010

Answers (1)

GRosenberg

Reputation: 5991

Cannot answer your question given the incomplete information provided: the statement rule is partial and the compoundStatement rule is missing.

Nonetheless, there are two techniques you should be using to answer this kind of question yourself (in addition to unit tests).

First, ensure that the lexer is working as expected. This answer shows how to dump the token stream directly.
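As a rough sketch of what such a token dump can look like (this is not the code from the linked answer), the helper below fills a CommonTokenStream from any Lexer and prints one line per token. The class name TokenDump and its output format are made up for illustration; it assumes the ANTLR 4 runtime is on the classpath.

```java
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.Lexer;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.Vocabulary;

// Hypothetical helper, not part of ANTLR: prints every token a lexer produces.
public final class TokenDump {

    /** Formats one token as: line:col NAME 'text' (channel N). */
    public static String describe(Token token, Vocabulary vocabulary) {
        String name = vocabulary.getSymbolicName(token.getType());
        if (name == null) {
            // Tokens defined only as literals in the grammar have no symbolic name.
            name = vocabulary.getDisplayName(token.getType());
        }
        return token.getLine() + ":" + token.getCharPositionInLine()
                + " " + name + " '" + token.getText() + "'"
                + " (channel " + token.getChannel() + ")";
    }

    /** Runs the lexer to the end of its input and prints all tokens, EOF included. */
    public static void dump(Lexer lexer) {
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        tokens.fill(); // force the lexer to tokenize the whole input
        for (Token token : tokens.getTokens()) {
            System.out.println(describe(token, lexer.getVocabulary()));
        }
    }
}
```

With a generated lexer (the name MyLangLexer is an assumption here), a call would look like TokenDump.dump(new MyLangLexer(CharStreams.fromString("if (x == 2) {}"))). If a keyword such as If shows up under a different token name, for example as an identifier, the problem is in the lexer rules rather than in the parser rules.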

Second, use a custom ErrorListener to provide a meaningful, detailed description of the parse path for every error encountered. An example:

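import java.util.List;

import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.TokenStream;

// Log (a logging helper) and JavaLexer (the ANTLR-generated lexer) are project-specific classes.
public class JavaErrorListener extends BaseErrorListener {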

    public int lastError = -1;

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine,
            String msg, RecognitionException e) {

        Parser parser = (Parser) recognizer;
        String name = parser.getSourceName();
        TokenStream tokens = parser.getInputStream();

        Token offSymbol = (Token) offendingSymbol;
        int thisError = offSymbol.getTokenIndex();
        // An error reported on the final EOF token is only logged at debug level.
        if (offSymbol.getType() == Token.EOF && thisError == tokens.size() - 1) {
            Log.debug(this, name + ": Incorrect error: " + msg);
            return;
        }
        String offSymName = JavaLexer.VOCABULARY.getSymbolicName(offSymbol.getType());

        List<String> stack = parser.getRuleInvocationStack();
        // Collections.reverse(stack);

        Log.error(this, name);
        Log.error(this, "Rule stack: " + stack);
        Log.error(this, "At line " + line + ":" + charPositionInLine + " at " + offSymName + ": " + msg);

        if (thisError > lastError + 10) {
            lastError = thisError - 10;
        }
        for (int idx = lastError + 1; idx <= thisError; idx++) {
            Token token = tokens.get(idx);
            if (token.getChannel() != Token.HIDDEN_CHANNEL) Log.error(this, token.toString());
        }
        lastError = thisError;
    }
}

Note: adjust the Log statements to whatever logging package you are using.
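The lastError bookkeeping in the listener decides which tokens get printed for each error: at most the ten tokens ending at the offending one, and never a token already printed for an earlier error. The following stdlib-only sketch isolates that index arithmetic so it can be checked without ANTLR; the class ErrorWindow and the method indicesFor are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of the listener's token window logic.
public class ErrorWindow {

    private static final int WINDOW = 10;

    // Index of the token at which the previous error was reported (-1 = none yet).
    private int lastError = -1;

    /** Returns the token indices the listener would print for an error at thisError. */
    public List<Integer> indicesFor(int thisError) {
        if (thisError > lastError + WINDOW) {
            // Too far from the previous error: keep only the last WINDOW tokens.
            lastError = thisError - WINDOW;
        }
        List<Integer> indices = new ArrayList<>();
        for (int idx = lastError + 1; idx <= thisError; idx++) {
            indices.add(idx);
        }
        lastError = thisError;
        return indices;
    }

    public static void main(String[] args) {
        ErrorWindow window = new ErrorWindow();
        System.out.println(window.indicesFor(3));  // [0, 1, 2, 3]
        System.out.println(window.indicesFor(20)); // [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
        System.out.println(window.indicesFor(22)); // [21, 22]
    }
}
```

The real listener additionally skips tokens on the hidden channel, so the number of lines it logs can be smaller than the number of indices shown here.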

Finally, ANTLR doesn't do 'weird' things - just things that you don't understand yet.

Upvotes: 2
