Reputation: 305
I'm starting with Lark and got stuck on an issue with parsing special characters.
I have expressions given by a grammar. For example, these are valid expressions: Car{_}, Apple3{3+}, Dog{a_7}, r2d2{A3*}, A{+}, ... More formally, they have the form name{feature}, where
name: CNAME
feature: (DIGIT|LETTER|"+"|"-"|"*"|"_")+
The definition of constants can be found here.
The problem is that the special characters are not present in the produced tree (see example below). I have seen this answer, but it did not help me. I tried placing ! before the special characters, and I tried escaping them. I also enabled keep_all_tokens, but this is not desired because then the characters { and } are also present in the tree. Any ideas how to solve this problem? Thank you.
from lark import Lark

grammar = r"""
start: object
object : name "{" feature "}" | name
feature: (DIGIT|LETTER|"+"|"-"|"*"|"_")+
name: CNAME
%import common.LETTER
%import common.DIGIT
%import common.CNAME
%import common.WS
%ignore WS
"""

parser = Lark(grammar, parser='lalr',
              lexer='standard',
              propagate_positions=False,
              maybe_placeholders=False
              )

def test():
    test_str = '''
    Apple_3{3+}
    '''
    j = parser.parse(test_str)
    print(j.pretty())

if __name__ == '__main__':
    test()
The output looks like this:
start
  object
    name Apple_3
    feature 3
instead of
start
  object
    name Apple_3
    feature
      3
      +
Upvotes: 1
Views: 1765
Reputation: 4620
You said you tried placing ! before the special characters. As I understand the question you linked, the ! has to be placed before the rule:
!feature: (DIGIT|LETTER|"+"|"-"|"*"|"_")+
This produces your expected result for me:
start
  object
    name Apple_3
    feature
      3
      +
Upvotes: 1