João M. S. Silva

Reputation: 1148

How to parse C++ comments with lark?

How can I write a rule to parse C++ comments either on a line alone or after other code?

I've tried lots of combinations, the latest one being:

?comment: "//" /[^\n]*/ NEWLINE

Upvotes: 3

Views: 1874

Answers (3)

ay0ks

Reputation: 73

You simply define a terminal and then ignore it:

COMMENT : /\/\/[^\n]*/
        | /\/\*(.|\n)*?\*\//

%ignore COMMENT

NOTE: This only works if you also ignore all whitespace:

%import common.WS
%ignore WS

Upvotes: 2

Erez

Reputation: 1430

You had the right idea, but you should define the comment as a single terminal rather than a rule, both for performance and so that you can ignore it with %ignore.

COMMENT: "//" /[^\n]*/ NEWLINE

%ignore COMMENT

Example grammar:

from lark import Lark

g = r"""
!start: "hello"

COMMENT: "//" /[^\n]*/ _NEWLINE
_NEWLINE: "\n"
%ignore COMMENT
%ignore " "
"""

parser = Lark(g)
print(parser.parse("hello // World \n"))

Upvotes: 3

João M. S. Silva

Reputation: 1148

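I ended up using this rule:

?comment: /\/\/[^\n]*/

Because the rule is inlined (the leading ?), the comment shows up in the parse tree as a lark.lexer.Token rather than as a subtree, so I had to handle it as a Token.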

Upvotes: 0
