Matho

Reputation: 305

Lark: parsing special characters

I'm starting with Lark and got stuck on an issue with parsing special characters.

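I have expressions given by a grammar. For example, these are valid expressions: Car{_}, Apple3{3+}, Dog{a_7}, r2d2{A3*}, A{+}... More formally, they have the form name{feature}, where name is a CNAME and feature is a non-empty sequence of digits, letters and the characters +, -, * and _.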

The definition of constants can be found here.

The problem is that the special characters are not present in the produced tree (see the example below). I have seen this answer, but it did not help me. I tried placing ! before the special characters and escaping them. I also enabled keep_all_tokens, but that is not what I want, because then the characters { and } also appear in the tree. Any ideas how to solve this problem? Thank you.

from lark import Lark

grammar = r"""
    start: object

    object : name "{" feature "}" | name

    feature: (DIGIT|LETTER|"+"|"-"|"*"|"_")+
    name: CNAME

    %import common.LETTER
    %import common.DIGIT
    %import common.CNAME
    %import common.WS
    %ignore WS
"""

parser = Lark(grammar, parser='lalr',
              lexer='standard',
              propagate_positions=False,
              maybe_placeholders=False)


def test():
    test_str = '''
        Apple_3{3+}
    '''

    j = parser.parse(test_str)
    print(j.pretty())

if __name__ == '__main__':
    test()

The output looks like this:

start
  object
    name    Apple_3
    feature 3

instead of

start
  object
    name    Apple_3
    feature 
      3
      +

Upvotes: 1

Views: 1765

Answers (1)

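You said you tried placing ! before the special characters. As I understand the question you linked, the ! has to be placed before the rule instead. Lark filters anonymous string tokens such as "+" out of the tree by default, and prefixing a rule with ! keeps all tokens matched by that rule only, so the { and } in object are still dropped: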

!feature: (DIGIT|LETTER|"+"|"-"|"*"|"_")+

This produces your expected result for me:

start
  object
    name    Apple_3
    feature
      3
      +

Upvotes: 1
