Seiten

Reputation: 138

Multiline comments in JavaCC

I'm trying to write a scanner for JavaScript with JavaCC. I have several problems, one of which is C-style comments: /* … */. I need to return the comments as tokens.

Here is one attempt:

TOKEN: {<MLCOMMENT:          "/*"        ( ~["*"] | ("*"(~["/"])?) )* "*/">}
TOKEN: {<MLCOMMENT_UNDELIM: ("/*"|"/*/") ( ~["/"] | (~["*"]"/")    )*     >}

MLCOMMENT was intended to match closed comments, and MLCOMMENT_UNDELIM open-ended comments. This doesn't work because JavaCC prefers the longest match, and /*a*/b*/ is a longer match for MLCOMMENT than /*a*/.
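To double-check that claim outside JavaCC, here is a rough sketch that translates the first MLCOMMENT expression into java.util.regex. The class and method names are made up for illustration, and the regex is my hand translation, not something JavaCC produces.

```java
import java.util.regex.Pattern;

// Sketch only: hypothetical helper, not part of any JavaCC-generated code.
class FirstAttemptCheck {
    // Hand translation of: "/*" ( ~["*"] | ("*"(~["/"])?) )* "*/"
    static final Pattern FIRST_MLCOMMENT =
        Pattern.compile("/\\*(?:[^*]|\\*[^/]?)*\\*/");

    // True if the whole string is one MLCOMMENT according to that expression.
    static boolean isWholeToken(String s) {
        return FIRST_MLCOMMENT.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Both strings are accepted, so a longest-match scanner takes the second.
        System.out.println(isWholeToken("/*a*/"));
        System.out.println(isWholeToken("/*a*/b*/"));
    }
}
```

Since both /*a*/ and /*a*/b*/ are accepted in full, a scanner that prefers the longest match will take the longer one.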

Here is another attempt at solving this problem:

MORE:
{
    "/*" : WithinMLComment
}
< WithinMLComment > TOKEN :
{
    < MLCOMMENT: "*/" > : DEFAULT
}
< WithinMLComment > MORE :
{
    < ~[] >
}

This doesn't work either, since an open-ended comment would hit EOF while the scanner is still in the WithinMLComment state. That's illegal (a TokenMgrError is thrown).

Update: I may have found the solution:

TOKEN: {<MLCOMMENT:         ("/*"|"/*/") ( ~["/"] | (~["*"]"/") )* "*/">}
TOKEN: {<MLCOMMENT_UNDELIM: ("/*"|"/*/") ( ~["/"] | (~["*"]"/") )*     >}

Update 2: That wasn't the solution. /**// is matched in full by MLCOMMENT_UNDELIM, which is longer than the /**/ that MLCOMMENT matches.
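The same kind of check for the updated pair, again as a sketch with hand-translated java.util.regex patterns and hypothetical names:

```java
import java.util.regex.Pattern;

// Sketch only: hypothetical helper for checking the two updated expressions.
class UpdatedAttemptCheck {
    // Hand translation of: ("/*"|"/*/") ( ~["/"] | (~["*"]"/") )* "*/"
    static final Pattern UPDATED_MLCOMMENT =
        Pattern.compile("(?:/\\*|/\\*/)(?:[^/]|[^*]/)*\\*/");

    // Hand translation of: ("/*"|"/*/") ( ~["/"] | (~["*"]"/") )*
    static final Pattern UPDATED_UNDELIM =
        Pattern.compile("(?:/\\*|/\\*/)(?:[^/]|[^*]/)*");

    static boolean isClosed(String s) {
        return UPDATED_MLCOMMENT.matcher(s).matches();
    }

    static boolean isUndelimited(String s) {
        return UPDATED_UNDELIM.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isClosed("/**/"));        // the 4-character comment
        System.out.println(isClosed("/**//"));       // all 5 characters as MLCOMMENT
        System.out.println(isUndelimited("/**//"));  // all 5 characters as MLCOMMENT_UNDELIM
    }
}
```

MLCOMMENT accepts only the four characters /**/, while MLCOMMENT_UNDELIM accepts all five of /**//, so the longer, wrong token wins.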

Upvotes: 2

Views: 2273

Answers (1)

Theodore Norvell

Reputation: 16221

For a multiline comment you can use

"/*" (~["*"])* "*" (~["*","/"] (~["*"])* "*" | "*")* "/"

For a multiline comment that is missing the final "*/", you can use

"/*" ( ~["*"] | ("*")+ ~["*","/"] )* ("*")*
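As a quick sanity check of these two expressions against the inputs from the question, here is a minimal sketch using java.util.regex. The regexes are hand translations of the JavaCC expressions above, and the class, method, and label strings are hypothetical. The classify method only imitates JavaCC's "longest match wins, earlier token wins ties" rule for these two patterns. It is not how the generated token manager works.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: hypothetical helper, not JavaCC-generated code.
class CommentPatterns {
    // Hand translation of:
    // "/*" (~["*"])* "*" (~["*","/"] (~["*"])* "*" | "*")* "/"
    static final Pattern CLOSED =
        Pattern.compile("/\\*[^*]*\\*(?:[^*/][^*]*\\*|\\*)*/");

    // Hand translation of:
    // "/*" ( ~["*"] | ("*")+ ~["*","/"] )* ("*")*
    static final Pattern UNCLOSED =
        Pattern.compile("/\\*(?:[^*]|\\*+[^*/])*\\**");

    // Length of the match anchored at the start of the input, or -1 if none.
    static int prefixLength(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.lookingAt() ? m.end() : -1;
    }

    // Picks the longer of the two prefix matches; CLOSED wins a tie.
    static String classify(String input) {
        int closed = prefixLength(CLOSED, input);
        int unclosed = prefixLength(UNCLOSED, input);
        if (closed < 0 && unclosed < 0) {
            return "NONE";
        }
        if (closed >= unclosed) {
            return "MLCOMMENT:" + input.substring(0, closed);
        }
        return "MLCOMMENT_UNDELIM:" + input.substring(0, unclosed);
    }

    public static void main(String[] args) {
        System.out.println(classify("/*a*/b*/")); // MLCOMMENT:/*a*/
        System.out.println(classify("/**//"));    // MLCOMMENT:/**/
        System.out.println(classify("/*abc"));    // MLCOMMENT_UNDELIM:/*abc
    }
}
```

The closed-comment expression can never run past the first "*/", and the unclosed one always stops just short of a "/" that follows a "*". For a properly closed comment, the closed token is therefore one character longer and wins.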

Upvotes: 5
