lilott8

Reputation: 1116

JavaCC newlines might be causing issues with parsing

I have a grammar as follows:

PARSER_BEGIN(Parser)
    package parser;
    public class Parser {}
PARSER_END(Parser)

SKIP: {
    " "
    |    "\t"
    |    "\n"
    |    "\r"
    |    "\f"
}

TOKEN : {
    <MIX: "mix">
    |    <WITH: "with">
    |    <FUNCTION: "function">
    |    <MANIFEST: "manifest">
    |    <REPEAT: "repeat">
    |    <NAT: "nat">
    |    <REAL: "real">
    |    <MAT: "mat">
    |    <FOR: "for">
    |    <INSTRUCTIONS: "instructions">
}

TOKEN : {
    <LPAREN: "(">
    |    <RPAREN: ")">
    |    <LBRACE: "{">
    |    <RBRACE: "}">
    |    <COLON: ":">
    |    <ASSIGN: "=">
    |    <COMMA: ",">
}

// Include the necessary <INTEGER_LITERAL> included in most examples

TOKEN : {
    <IDENTIFIER: <LETTER> (<LETTER>|<DIGIT>)*>
    // Letter and Digit are the unicode values.
}

void Program() :
{}
{
    ( Manifest() )*
    <INSTRUCTIONS>
    Statement()
    <EOF>
}

void Manifest() :
{}
{
    <MANIFEST> (Type())? PrimaryExpression()
}

void Statement() :
{}
{
    Instruction()
    |    Function()
}

void Instruction() :
{}
{
    (TypingList())* Identifier() <ASSIGN> Expression()
}

void TypingList() :
{}
{
    Type() ( TypingRest() )*
}

void TypingRest() :
{}
{
    <COMMA> Type()
}

void Type() :
{}
{
    <MAT>
    |    <NAT>
    |    <REAL>
}

void Function() :
{}
{
    <FUNCTION> Identifier() <LPAREN> (FormalParameterList())* <RPAREN> (<COLON> TypingList())? <LBRACE>
        Statement()
    <RBRACE>
}

void FormalParameterList() :
{}
{
    FormalParameter() (FormalParameterRest() )*
}

void FormalParameter() :
{}
{
    (TypingList())* Identifier()
}

void FormalParameterRest() :
{}
{
    <COMMA> FormalParameter()
}

void Identifier() :
{}
{
    <IDENTIFIER>
}

void Expression() :
{}
{
    <MIX> Identifier() <WITH> Identifier() <FOR> <INTEGER_LITERAL>
}

This should enable me to parse a simple program, such as:

manifest itemOne
manifest itemTwo

instructions

function doThis(argument) : nat {
    temp = mix one with two for 3
}

two = mix item3 with item4

However, when JavaCC reaches the temp = mix ... statement inside the function doThis, it reports that it found an identifier where it expected something else:

Exception in thread "main" parser.ParseException: Encountered " <IDENTIFIER> "temp "" at line x, column y.
Was expecting one of:
    "for" ...
    "}" ...

As you can see, my grammar says that an identifier can be assigned the value of a mix expression, yet the error says this is invalid. I've tried several variants of this, but nothing seems to work.

Upvotes: 1

Views: 204

Answers (1)

lilott8

Reputation: 1116

The problem is that Statement(), as written, matches exactly one of Instruction() | BranchStatement() | WhileStatement() | Function(). Once the parser has matched one of those alternatives, it cannot come back and match another, so any further statement shows up as an unexpected token.

To fix this, wrap the alternatives in Statement() in a one-or-more repetition, ( ... )+, e.g.:

void Statement() :
{}
{
    (
    Instruction()
    |    BranchStatement()
    |    WhileStatement()
    |    Function()
    )+
}

The + operator tells the parser that the group must match at least once and may then repeat.
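
To make the difference concrete, here is a minimal hand-written sketch in plain Java. It is not the code JavaCC generates, and the class and method names (StatementLoopSketch, parseSingle, parseRepeated) are made up for illustration. It only handles the mix instruction, and it splits tokens on whitespace as a stand-in for the SKIP rules, so the input needs spaces around "=".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a tiny recursive-descent parser for
//   Identifier "=" "mix" Identifier "with" Identifier "for" Integer
class StatementLoopSketch {
    private static final String IDENT = "[A-Za-z][A-Za-z0-9]*";

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    StatementLoopSketch(String source) {
        // Splitting on whitespace stands in for the SKIP rules (spaces, tabs, newlines).
        for (String t : source.split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    // Consume the next token if it matches the regex, otherwise fail.
    private void expect(String regex, String what) {
        if (!peek().matches(regex)) {
            throw new IllegalStateException(
                "Encountered \"" + peek() + "\" at token " + pos + ", was expecting " + what);
        }
        pos++;
    }

    private void instruction() {
        expect(IDENT, "identifier");
        expect("=", "\"=\"");
        expect("mix", "\"mix\"");
        expect(IDENT, "identifier");
        expect("with", "\"with\"");
        expect(IDENT, "identifier");
        expect("for", "\"for\"");
        expect("[0-9]+", "integer");
    }

    // Like Statement() without repetition: exactly one instruction, then end of input.
    public static int parseSingle(String source) {
        StatementLoopSketch p = new StatementLoopSketch(source);
        p.instruction();
        p.expect("<EOF>", "<EOF>");
        return 1;
    }

    // Like ( Instruction() )+ : at least one instruction, repeated while the
    // next token could start another one. Returns how many were parsed.
    public static int parseRepeated(String source) {
        StatementLoopSketch p = new StatementLoopSketch(source);
        int count = 0;
        do {
            p.instruction();
            count++;
        } while (p.peek().matches(IDENT));
        p.expect("<EOF>", "<EOF>");
        return count;
    }

    public static void main(String[] args) {
        String program = "temp = mix one with two for 3\ntwo = mix item3 with item4 for 5";
        System.out.println(parseRepeated(program));
        try {
            parseSingle(program);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a two-instruction input, parseRepeated accepts both instructions, while parseSingle stops after the first and rejects the identifier that starts the second, which is the same kind of failure as the ParseException in the question.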

Upvotes: 2
