David Williams

Reputation: 25

Redefining Lexical Tokens With JavaCC

I'm quite new to creating language syntax with JavaCC, and I need a way to let the user redefine a token from within the code being parsed.

For example, the line

REDEFINE IF FOO

should change the definition of "IF" from

  < IF: "IF" >

to

  < IF: "FOO" >

If this is not possible, what would be the best way of solving this problem?

Upvotes: 1

Views: 806

Answers (1)

Theodore Norvell

Reputation: 16221

I think you can do it with a token action that changes the kind field of the token.

Something like the following. [Untested code follows. If you use it, please correct any errors in this answer.]

Make a token manager declaration of a hash map:

TOKEN_MGR_DECLS: {
    public java.util.HashMap<String,Integer> keywordMap = new java.util.HashMap<String,Integer>()  ;
    {   keywordMap.put( "IF", ...Constants.IF); }
}

Make a definition for identifiers.

TOKEN : { <ID : (["a"-"z","A"-"Z"])(["a"-"z","A"-"Z","0"-"9"])* >
               { if( keywordMap.containsKey( matchedToken.image ) ) {
                     matchedToken.kind = keywordMap.get( matchedToken.image ) ; }
               }
         }

Make definitions for the keywords. These need to come after the definition of ID. They exist only so that the token kinds are created; they will be unreachable and may cause warnings.

TOKEN : { <IF : "A"> | ... }

In the parser, you need to define a redefine production:

void redefine() :
{
    Token oldToken;
    Token newToken;
}
{
    <REDEFINE> oldToken=redefinableToken() newToken=redefinableToken() 
    {
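        if( ...TokenManager.keywordMap.containsKey( oldToken.image ) ) {
            // Move the keyword's kind from its old spelling to the new one.
            ...TokenManager.keywordMap.remove( oldToken.image ) ;
            ...TokenManager.keywordMap.put( newToken.image, oldToken.kind ) ; }
        else {
            // oldToken is not currently a keyword, so it cannot be redefined.
            throw new ParseException( "Cannot redefine " + oldToken.image ) ; }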
    }
}

Token redefinableToken() : 
{ Token t ; }
{
    t=<ID>  {return t ;}
|   t=<IF>  {return t ;}
| ...
}
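
To check the intended effect without generating a parser, here is a minimal plain-Java sketch of the same bookkeeping. The class name, the classify and redefine helpers, and the numeric kind values are hypothetical stand-ins (JavaCC generates the real kind constants); only the map logic mirrors the grammar above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the keyword map kept in the token manager.
class KeywordRedefinitionSketch {
    // Illustrative kind values; JavaCC generates the real ones.
    static final int ID = 1;
    static final int IF = 2;

    private final Map<String, Integer> keywordMap = new HashMap<>();

    KeywordRedefinitionSketch() {
        keywordMap.put("IF", IF);
    }

    // Mirrors the token action on ID: a spelling found in the map gets the keyword's kind.
    int classify(String image) {
        return keywordMap.getOrDefault(image, ID);
    }

    // Mirrors the redefine production: move the kind from the old spelling to the new one.
    void redefine(String oldImage, String newImage) {
        Integer kind = keywordMap.remove(oldImage);
        if (kind == null) {
            throw new IllegalArgumentException(oldImage + " is not a redefinable keyword");
        }
        keywordMap.put(newImage, kind);
    }

    public static void main(String[] args) {
        KeywordRedefinitionSketch lexer = new KeywordRedefinitionSketch();
        System.out.println(lexer.classify("IF") == IF);   // true
        lexer.redefine("IF", "FOO");                      // REDEFINE IF FOO
        System.out.println(lexer.classify("FOO") == IF);  // true
        System.out.println(lexer.classify("IF") == ID);   // true
    }
}
```

After the redefinition, "FOO" lexes as the IF kind and "IF" falls back to being an ordinary identifier, which is the behaviour the question asks for.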

See the FAQ (4.14) for warnings about trying to alter the behaviour of the lexer from the parser. Long story short: avoid lookahead.
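
The timing problem can be illustrated with a small plain-Java simulation. This is only a sketch under assumptions: the class, the kindOfNextWord helper, and the kind values are hypothetical, and the lexedBeforeAction flag stands in for the parser having looked ahead past the REDEFINE statement before its action ran.

```java
import java.util.HashMap;
import java.util.Map;

class LookaheadPitfallSketch {
    // Illustrative kind values; JavaCC generates the real ones.
    static final int ID = 1;
    static final int IF = 2;

    // Returns the kind assigned to the word that follows "REDEFINE IF FOO".
    static int kindOfNextWord(String nextWord, boolean lexedBeforeAction) {
        Map<String, Integer> keywordMap = new HashMap<>();
        keywordMap.put("IF", IF);

        Integer early = null;
        if (lexedBeforeAction) {
            // Lookahead made the lexer classify the next word before the action ran.
            early = keywordMap.getOrDefault(nextWord, ID);
        }

        // The parser action for REDEFINE IF FOO finally runs.
        keywordMap.put("FOO", keywordMap.remove("IF"));

        // A token that was already lexed keeps the kind it was given at that time.
        return early != null ? early : keywordMap.getOrDefault(nextWord, ID);
    }

    public static void main(String[] args) {
        System.out.println(kindOfNextWord("FOO", false) == IF); // true
        System.out.println(kindOfNextWord("FOO", true) == ID);  // true
    }
}
```

In the second call the word "FOO" was tokenized under the old map, so it stays an identifier even though the redefinition has since taken effect.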


Another approach is to simply have one token kind, say ID, and handle everything in the parser. See FAQ 4.19 on "Replacing keywords with semantic lookahead". Here lookahead will be less of a problem because semantic actions in the parser aren't executed during syntactic lookahead (FAQ 4.10).
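
A rough sketch of the bookkeeping for this second approach, in plain Java. Everything here is an assumption made for illustration: the class name, the isKeyword and redefine helpers, and the choice to key the table by the keyword's role rather than by its current spelling. In a grammar, a check like isKeyword would sit inside a semantic lookahead such as LOOKAHEAD( { ... } ) and be applied to getToken(1).image.

```java
import java.util.HashMap;
import java.util.Map;

// The lexer only ever produces ID tokens; the parser decides which ones act as keywords.
class ParserSideKeywordsSketch {
    // Maps a keyword's role (e.g. "IF") to its current spelling.
    private final Map<String, String> spellingOf = new HashMap<>();

    ParserSideKeywordsSketch() {
        spellingOf.put("IF", "IF");
    }

    // The condition a semantic lookahead would evaluate on the next token's image.
    boolean isKeyword(String role, String image) {
        return image.equals(spellingOf.get(role));
    }

    // Effect of REDEFINE: the role keeps its meaning but gets a new spelling.
    void redefine(String role, String newSpelling) {
        if (!spellingOf.containsKey(role)) {
            throw new IllegalArgumentException(role + " is not a redefinable keyword");
        }
        spellingOf.put(role, newSpelling);
    }

    public static void main(String[] args) {
        ParserSideKeywordsSketch parser = new ParserSideKeywordsSketch();
        System.out.println(parser.isKeyword("IF", "IF"));   // true
        parser.redefine("IF", "FOO");
        System.out.println(parser.isKeyword("IF", "FOO"));  // true
        System.out.println(parser.isKeyword("IF", "IF"));   // false
    }
}
```

Because the table lives in the parser and the lexer never changes, tokens that were scanned ahead of time are classified only when the parser actually reaches them.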

Upvotes: 1
