superfluous LOOKAHEAD in javacc causes error?

Question

I have the following TT.jj. With the SomethingElse part below uncommented, it successfully parses a language of the form create create blahblah or create blahblah. But if I comment out the SomethingElse part and keep the LOOKAHEAD, javacc warns that the lookahead is not necessary and is "ignored", yet the resulting parser accepts only the empty string.

I thought that since javacc says the lookahead is "ignored", it should have no effect. Instead, a superfluous LOOKAHEAD apparently causes an error. How does that work exactly? Maybe javacc's implementation of LOOKAHEAD does not quite match the spec?

    options{
        IGNORE_CASE=true ;
        STATIC=false;
        DEBUG_PARSER=true;
        DEBUG_LOOKAHEAD=false;
        DEBUG_TOKEN_MANAGER=false;
    //  FORCE_LA_CHECK=true;
        UNICODE_INPUT=true;
    }

    PARSER_BEGIN(TT)

    import java.util.*;

    /**
     * The parser generated by JavaCC
     */
    public class TT {

    }

    PARSER_END(TT)


    ///////////////////////////////////////////// main stuff concerned
    void Statement() :
    { }
    {
    LOOKAHEAD(2)
    CreateTable()
    //|
    //SomethingElse()
    }

    void CreateTable():
    {
    }
    {
              
    }

    //void SomethingElse():
    //{}{
    //       
    //}
    //
    //////////////////////////////////////////////////////////


SKIP:
{
    " "
|   "\t"
|   "\r"
|   "\n"
}

TOKEN: /* SQL Keywords. prefixed with K_ to avoid name clashes */
{

}


TOKEN : /* Numeric Constants */
{
   < S_DOUBLE: (()? "."  ( ["e","E"] (["+", "-"])? )?
                        |
                         "." (["e","E"] (["+", "-"])? )?
                        |
                         ["e","E"] (["+", "-"])? 
                        )>
  |     < S_LONG: (  )+ >
  |     < #DIGIT: ["0" - "9"] >
}


TOKEN:
{
        < S_IDENTIFIER: (  |  )+ (  |  |  | )* >
|       < #LETTER: ["a"-"z", "A"-"Z", "_", "$"] >
|   < #SPECIAL_CHARS: "$" | "_" | "#" | "@">
|   < S_CHAR_LITERAL: "'" (~["'"])* "'" ("'" (~["'"])* "'")*>
|   < S_QUOTED_IDENTIFIER: "\"" (~["\n","\r","\""])+ "\"" | ("`" (~["\n","\r","`"])+ "`") | ( "[" ~["0"-"9","]"] (~["\n","\r","]"])* "]" ) >

/*
To deal with database names (columns, tables) using not only latin base characters, one
can expand the following rule to accept additional letters. Here is the addition of german umlauts.

There seems to be no way to recognize letters by an external function to allow
a configurable addition. One must rebuild JSqlParser with this new "Letterset".
*/
|   < #ADDITIONAL_LETTERS: ["ä","ö","ü","Ä","Ö","Ü","ß"] >
}

Theodore Norvell · Accepted Answer

The lookahead specification that JavaCC says it is ignoring is not actually ignored. Moral: don't put lookahead specifications at nonchoice points.

In more detail: when a lookahead (other than a purely semantic lookahead) appears at a nonchoice point, JavaCC appears to generate a lookahead method that always returns false. The lookahead therefore fails and, there being no other choice, an exception is thrown.
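
To make the moral concrete, here is a minimal sketch of the two shapes of Statement() from the question. It reuses only the production names that appear in TT.jj (Statement, CreateTable, SomethingElse) and leaves their bodies out. The two definitions are alternatives, not meant to coexist in one grammar file, and the remark about a shared first token is an assumption based on the "create create blahblah" / "create blahblah" inputs described in the question.

    // Variant A: nonchoice point.
    // Statement() has a single expansion, so there is nothing to choose
    // between. Leave the LOOKAHEAD out entirely.
    void Statement() :
    {}
    {
        CreateTable()
    }

    // Variant B: choice point.
    // With two alternatives, LOOKAHEAD(2) tells the parser to inspect two
    // tokens before committing to CreateTable(). This is presumably needed
    // here because both alternatives can begin with the same token.
    void Statement() :
    {}
    {
        LOOKAHEAD(2)
        CreateTable()
    |
        SomethingElse()
    }

With variant A the "ignored" warning goes away along with the always-failing lookahead check; with variant B the lookahead sits at a real choice point, which is where JavaCC expects it.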
