StevieD

Reputation: 7443

How do I match a newline or the end of a file in a Raku grammar?

I have run into headaches trying to coerce a grammar to match the last line of a file if it is not followed by a newline:

Line 1
Line 2 EOF

This attempted solution, which makes the newline optional, causes an infinite loop:

my grammar HC4 {
    token TOP {  <line>+ }
    token line { [ <header> | <not-header> ] \n? } # optional newline

    token header { <header-start> <header-content> }
    token not-header { <not-header-content> }
    token header-start { \s* '#' ** 1..6 }
    token header-content { \N* }
    token not-header-content { \N* }
}

The \N* tokens match the empty string after the last character of the last line, so <line>+ keeps succeeding without consuming any input and never terminates.

I have tried using <[\n\Z]> but then the compiler complains and suggests using \n?$ which I tried but that does not work either. After a lot of trial and error, the only solution I discovered that works requires me to create a new <blank> capture and to change the \N* to \N+:

my grammar HC3 {
    token TOP {  <line>+ }
    token line { [ <header> | <blank> | <not-header> ] \n? }

    token header { <header-start> <header-content> }
    token blank { \h* <?[\n]> }
    token not-header { <not-header-content> }
    token header-start { \s* '#' ** 1..6 }
    token header-content { \N+ }
    token not-header-content { \N+ }
}

I'd like to know if there is a more straightforward way of accomplishing this, though. Thanks.

Upvotes: 4

Views: 286

Answers (3)

VZ.

Reputation: 22688

I believe the simplest solution is something like this:

grammar LineOriented {
    token TOP {
        <line>* %% \n
    }

    token line {
        ^^ \N*
    }
}

Using %% instead of % allows, but does not require, a trailing separator, so the newline after the last line is optional.
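As a quick check, here is a minimal usage sketch that runs the same grammar against input with and without a final newline. The sample strings are assumptions made up for illustration, and the grammar is repeated so the snippet stands on its own:

```raku
# Same grammar as above, repeated so this snippet is self-contained.
grammar LineOriented {
    token TOP  { <line>* %% \n }   # lines separated by \n; trailing \n optional
    token line { ^^ \N* }          # anchored at the start of a line
}

# Hypothetical sample inputs: one without and one with a final newline.
for "Line 1\nLine 2", "Line 1\nLine 2\n" -> $text {
    my $match = LineOriented.parse($text);
    # Report how many <line> captures were made, or that parsing failed.
    say $match ?? "parsed { $match<line>.elems } line(s)" !! 'no match';
}
```

If both inputs parse, the grammar handles a missing final newline without a separate <blank> rule.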

Upvotes: 1

StevieD

Reputation: 7443

OK, after some investigation, I discovered the root cause of my woes:

[Screenshot: IntelliJ IDE, Editor -> General settings]

This screenshot is from the IntelliJ IDE's Editor -> General settings. By default, the "Ensure every saved file ends with a line break" option is not enabled. So when I saved a file after deleting the very last line to clean it up, the final \n character was stripped along with it. Turn that setting on to avoid my pain, suffering and deep psychological trauma.

Upvotes: 3

StevieD

Reputation: 7443

I think I may have found something that can work and is simple:

my grammar G {
    token TOP {  (^^ <line>)+ }
    token line { \N* \n? }
}

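The ^^ anchor matches only at the beginning of a line, so once the last line has been consumed it fails at the end of the input instead of letting <line> match the empty string again. That is what stops the infinite loop.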

Upvotes: 4
