Reputation: 81
I am trying to use a PEG grammar to parse a file. My grammar is:
WHITESPACE = _{" "}
level = {ASCII_DIGIT*}
verb = {ASCII_ALPHA{,4}}
value = {ASCII_ALPHANUMERIC*}
structure = { level ~ verb ~ value }
file = { SOI ~ (structure? ~ NEWLINE)* ~ EOI }
I parse this text:
0 HEAD
1 VERB test
2 STOP
The file rule parses the text successfully only if there is an extra \n at the end of the text. If I remove the \n, parsing fails with 'expected EOI'. I understand that this happens because of my file rule. I tried different rules for file and got an infinite loop, so I don't know how to solve this. I am using Rust and the latest pest.
Upvotes: 1
Views: 350
Reputation: 4132
This seems to work. It can also handle an arbitrary number of newlines at the beginning or end:
file = { SOI ~ NEWLINE* ~ structure ~ (NEWLINE ~ structure)* ~ NEWLINE* ~ EOI }
WHITESPACE = _{" "}
level = {ASCII_DIGIT+}
verb = {ASCII_ALPHA{1,4}}
value = {ASCII_ALPHANUMERIC*}
structure = { level ~ verb ~ value }
Upvotes: 1
Reputation: 81
WHITESPACE = _{ " " }
level = {ASCII_DIGIT+}
verb = {ASCII_ALPHA{,4}}
value = {ASCII_ALPHANUMERIC*}
stop = { level ~ "STOP" }
structure = { level ~ verb ~ value }
line = {structure | stop}
file = { SOI ~ (line ~ NEWLINE?)* ~ EOI }
Checked on https://pest.rs/
Upvotes: -1
Reputation: 8544
I changed the rules to:
level = {ASCII_DIGIT+}
verb = {ASCII_ALPHA{1,4}}
file = { SOI ~ (structure? ~ NEWLINE)* ~ structure? ~ EOI }
and that seemed to work just fine, regardless of the trailing newline. But maybe I overlooked something. If you could edit your question to show the rules and input that caused an infinite loop with this, that'd be great.
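To make the logic of that file rule concrete, here is a minimal hand-written sketch in plain Rust (no pest involved) of what `(structure? ~ NEWLINE)* ~ structure?` accepts: every line is either empty or a structure, and the last line does not need a terminating newline. The names `Structure`, `parse_structure` and `parse_file` are hypothetical, and the sketch simplifies pest's behaviour by assuming tokens are separated by spaces and by ignoring a bare \r as a line break.

```rust
/// One parsed line: level, verb and a possibly empty value.
/// (Hypothetical type, only for this illustration.)
#[derive(Debug, PartialEq)]
struct Structure {
    level: u32,
    verb: String,
    value: String,
}

/// Rough equivalent of `structure = { level ~ verb ~ value }` with
/// level = ASCII_DIGIT+, verb = ASCII_ALPHA{1,4}, value = ASCII_ALPHANUMERIC*.
/// Assumption: tokens are separated by at least one space.
fn parse_structure(line: &str) -> Option<Structure> {
    let mut tokens = line.split(' ').filter(|t| !t.is_empty());
    let level = tokens.next()?;
    let verb = tokens.next()?;
    let value = tokens.next().unwrap_or("");
    if tokens.next().is_some() {
        return None; // more than three tokens on the line
    }
    if !level.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    let level: u32 = level.parse().ok()?;
    let verb_ok = (1..=4).contains(&verb.len())
        && verb.chars().all(|c| c.is_ascii_alphabetic());
    if !verb_ok || !value.chars().all(|c| c.is_ascii_alphanumeric()) {
        return None;
    }
    Some(Structure {
        level,
        verb: verb.to_string(),
        value: value.to_string(),
    })
}

/// Rough equivalent of `SOI ~ (structure? ~ NEWLINE)* ~ structure? ~ EOI`.
/// `str::lines` splits on "\n" and "\r\n" and does not yield an extra empty
/// line for a trailing newline, so both inputs are treated the same way.
fn parse_file(input: &str) -> Option<Vec<Structure>> {
    let mut out = Vec::new();
    for line in input.lines() {
        if line.trim_matches(' ').is_empty() {
            continue; // `structure?` matched nothing on this line
        }
        out.push(parse_structure(line)?);
    }
    Some(out)
}
```

With this sketch, `parse_file("0 HEAD\n1 VERB test\n2 STOP")` and the same text with a trailing \n should both yield the same three entries, which is the behaviour the extra `structure?` before `EOI` is meant to give.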
Upvotes: 0