Reputation: 41
I have the following piece of code and would like to exclude reserved words from matching as identifiers in < ID : (<LETTER>)+("_")*(<#DIGIT>)*(<LETTER>)* >. I understand that I can match one or more, or zero or more, occurrences, but how can I possibly exclude something from a regular expression? Any guidance would be greatly appreciated.
TOKEN : /* Numbers and identifiers */
{
< INT : (<DIGIT>)+ >
| < #DIGIT : ["0" - "9"] >
| < ID : (<LETTER>)+("_")*(<#DIGIT>)*(<LETTER>)* >
| < #LETTER : ["a" - "z", "A" - "Z"] >
}
TOKEN : { /* RESERVED WORDS */
<VARIABLE: "variable">
| <CONSTANT: "constant">
| <RETURN: "return">
| <INTEGER: "integer">
| <BOOLEAN: "boolean">
| <VOID: "void">
| <MAIN: "main">
| <IF: "if">
| <ELSE : "else">
| <TRUE: "true">
| <FALSE: "false">
| <WHILE: "while">
| <BEGIN: "begin">
| <END: "end">
| <IS: "is">
| <SKIP: "skip">
}
Upvotes: 2
Views: 503
Reputation: 16221
When two token definitions both match the same longest prefix of the input, the one that appears first in the grammar file wins. (See the JavaCC FAQ.)
So the solution is easy: reorder the token definitions so that the reserved words come before ID:
TOKEN : { /* RESERVED WORDS */
<VARIABLE: "variable">
| <CONSTANT: "constant">
| <RETURN: "return">
| <INTEGER: "integer">
| ...
}
TOKEN : /* Numbers and identifiers */
{
< INT : (<DIGIT>)+ >
| < #DIGIT : ["0" - "9"] >
| < ID : (<LETTER>)+("_")*(<DIGIT>)*(<LETTER>)* >
| < #LETTER : ["a" - "z", "A" - "Z"] >
}
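To see why the order matters, here is a small plain-Java sketch of the "longest match, then first definition wins" rule. It is not JavaCC's generated code, only an illustration. The class TokenOrderDemo, the method firstTokenKind, and the use of java.util.regex are my own assumptions for the demo, and only a few of the reserved words are included.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenOrderDemo {
    // Hypothetical helper: builds the ID rule from the question as a Java regex.
    // For this particular pattern, greedy matching also yields the longest match.
    static Pattern idPattern() {
        return Pattern.compile("[a-zA-Z]+_*[0-9]*[a-zA-Z]*");
    }

    // Rules in declaration order: reserved words first, ID last.
    static Map<String, Pattern> reservedFirst() {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("IF", Pattern.compile("if"));
        rules.put("IS", Pattern.compile("is"));
        rules.put("WHILE", Pattern.compile("while"));
        rules.put("ID", idPattern());
        return rules;
    }

    // Same rules, but ID is declared before the reserved words.
    static Map<String, Pattern> idFirst() {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("ID", idPattern());
        rules.put("IF", Pattern.compile("if"));
        rules.put("IS", Pattern.compile("is"));
        rules.put("WHILE", Pattern.compile("while"));
        return rules;
    }

    // Returns the kind of the token at the start of the input:
    // the longest match wins, and ties go to the earliest rule.
    static String firstTokenKind(Map<String, Pattern> rules, String input) {
        String bestKind = null;
        int bestLength = 0;
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // Strictly greater: on a tie the earlier rule is kept.
            if (m.lookingAt() && m.end() > bestLength) {
                bestKind = rule.getKey();
                bestLength = m.end();
            }
        }
        return bestKind;
    }

    public static void main(String[] args) {
        System.out.println(firstTokenKind(reservedFirst(), "while"));   // WHILE
        System.out.println(firstTokenKind(reservedFirst(), "while_1")); // ID
        System.out.println(firstTokenKind(reservedFirst(), "iffy"));    // ID
        System.out.println(firstTokenKind(idFirst(), "while"));         // ID
    }
}
```

With the reserved words declared first, "while" is a tie between WHILE and ID and goes to WHILE, while longer inputs such as "while_1" or "iffy" are still identifiers because ID matches more characters. With ID declared first, ID wins every tie, which is the behaviour described in the question.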
Upvotes: 2