Reputation: 7443
I have run into headaches trying to coerce a grammar to match the last line of a file if it is not followed by a newline:
Line 1
Line 2 EOF
This attempted solution, which makes the newline optional, causes an infinite loop:
my grammar HC4 {
    token TOP { <line>+ }
    token line { [ <header> | <not-header> ] \n? } # optional newline
    token header { <header-start> <header-content> }
    token not-header { <not-header-content> }
    token header-start { \s* '#' ** 1..6 }
    token header-content { \N* }
    token not-header-content { \N* }
}
The \N* bits keep matching the empty string '' after the last character of the last line, forever.
I have tried using <[\n\Z]> but then the compiler complains and suggests using \n?$ which I tried, but that does not work either. After a lot of trial and error, the only working solution I discovered requires me to create a new <blank> capture and to change the \N* to \N+:
my grammar HC3 {
    token TOP { <line>+ }
    token line { [ <header> | <blank> | <not-header> ] \n? }
    token header { <header-start> <header-content> }
    token blank { \h* <?[\n]> }
    token not-header { <not-header-content> }
    token header-start { \s* '#' ** 1..6 }
    token header-content { \N+ }
    token not-header-content { \N+ }
}
I'd like to know if there is a more straightforward way of accomplishing this, though. Thanks.
Upvotes: 4
Views: 286
Reputation: 22688
I believe the simplest solution is something like this:
grammar LineOriented {
    token TOP {
        <line>* %% \n
    }
    token line {
        ^^ \N*
    }
}
Using %% allows, but does not require, a trailing newline after the last line.
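As a quick check of that claim, here is a minimal sketch that runs the same grammar over two inputs, one without and one with a trailing newline. The sample strings and the loop are illustrative assumptions, not part of the original answer.

    grammar LineOriented {
        token TOP  { <line>* %% \n }   # lines separated by \n, trailing \n optional
        token line { ^^ \N* }          # ^^ anchors each match at the start of a line
    }

    # Hypothetical sample inputs: same content, without and with a final newline
    for "Line 1\nLine 2", "Line 1\nLine 2\n" -> $text {
        my $match = LineOriented.parse($text);
        # Print the captured lines as plain strings
        say $match<line>.map(~*).raku;
    }

Both inputs should produce the same two captured lines, since ^^ does not match at the very end of the string and so no extra empty line is captured after the final newline.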
Upvotes: 1
Reputation: 7443
OK, after some investigation, I discovered the root cause of my woes:
The relevant option is in the IntelliJ IDE's Editor -> General settings. By default, "Ensure every saved file ends with a line break" is not checked. So if I saved a file after deleting the very last line to clean it up, the final \n character was stripped. Turn that setting on to avoid my pain, suffering and deep psychological trauma.
Upvotes: 3
Reputation: 7443
I think I may have found something that can work and is simple:
my grammar G {
    token TOP { (^^ <line>)+ }
    token line { \N* \n? }
}
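The ^^ anchor, which matches at the beginning of a line, stops the infinite loop: it does not match at the end of the string, so the empty match after the last character is never attempted.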
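A small sketch of how this grammar might be exercised on input with and without a final newline. The sample strings and the way the captures are printed are assumptions made for illustration.

    my grammar G {
        token TOP  { (^^ <line>)+ }
        token line { \N* \n? }
    }

    # Hypothetical sample inputs: same content, without and with a final newline
    for "Line 1\nLine 2", "Line 1\nLine 2\n" -> $text {
        my $match = G.parse($text);
        # $match[0] holds one sub-match per repetition of the ( ... ) group;
        # each <line> capture includes its newline, so trim it for display
        say $match[0].map({ .<line>.Str.trim-trailing }).raku;
    }

Note that <line> sits inside a positional group here, so the lines are reached through $match[0] rather than $match<line>.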
Upvotes: 4