pistacchio

Reputation: 58873

Perl6 grammars: match full line

I've just started exploring Perl 6 grammars. How can I define a token "line" that matches everything between the beginning of a line and its end? I've tried the following without success:

my $txt = q:to/EOS/;
    row 1
    row 2
    row 3
    EOS


grammar sample {
    token TOP {
        <line>
    }

    token line {
        ^^.*$$
    }
}

my $match = sample.parse($txt);

say $match<line>[0];

Upvotes: 9

Views: 395

Answers (4)

Christoph

Reputation: 169573

Your original approach can be made to work via

grammar sample {
    token TOP { <line>+ %% \n }
    token line { ^^ .*? $$ }
}

Personally, I would not try to anchor line, and would use \N instead, as already suggested.
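
The unanchored \N variant could look roughly like this. It is only a minimal sketch, assuming every line is non-empty (\N+ will not match a blank line) and reusing the three-row input from the question:

```raku
# Same three rows as in the question, written as a plain string.
my $txt = "row 1\nrow 2\nrow 3\n";

grammar sample {
    # One or more lines separated by newlines; %% also allows a trailing newline.
    token TOP  { <line>+ %% \n }
    # A line is a run of non-newline characters; no anchors needed.
    token line { \N+ }
}

my $match = sample.parse($txt);

# <line> is quantified, so $match<line> is a list of Match objects.
say ~$_ for $match<line>;
```

Because \N never crosses a newline, each line token stops on its own and the anchors become unnecessary.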

Upvotes: 8

Pierre VIGIER

Reputation: 156

I can see two problems in your grammar. The first is the token line: ^^ and $$ anchor to the start and end of a line, but . also matches newlines, so the match can span several lines. To illustrate, let's use a plain regex first, without a grammar:

my $txt = q:to/EOS/;
    row 1
    row 2
    row 3
    EOS

if $txt ~~ m/^^.*$$/ {
    say "match";
    say $/;
}

Running that, the output is:

match
「row 1
row 2
row 3」

As you can see, the regex matches more than desired. That is not the main problem, though: because of ratcheting, the same pattern does not match at all when used as a token:

my $txt = q:to/EOS/;
    row 1
    row 2
    row 3
    EOS

my regex r {^^.*$$};
if $txt ~~ &r {
    say "match regex";
    say $/;
} else {
    say "does not match regex";
}
my token t {^^.*$$};
if $txt ~~ &t {
    say "match token";
    say $/;
} else {
    say "does not match token";
}

Running that, the output is:

match regex
「row 1
row 2
row 3」
does not match token

As far as I can tell, the reason is that a token never backtracks: the greedy .* consumes everything up to the end of the string, including the final newline, and $$ does not match there. What you want instead is to match everything except a newline, which is \N*. The following grammar mostly solves your issue:

grammar sample {
    token TOP {<line>}
    token line {\N+}
}

However, it only matches the first occurrence, since you search for only one line. What you might want instead is to search for a line followed by an optional vertical whitespace (in your case there is a newline at the end of the string, but I guess you would like to capture the last line even if there is no trailing newline), repeated several times:

my $txt = q:to/EOS/;
    row 1
    row 2
    row 3
    EOS

grammar sample {
    token TOP {[<line>\v?]*}
    token line {\N+}
}

my $match = sample.parse($txt);
for $match<line> -> $l {
    say $l;
}

The output of that script is:

「row 1」
「row 2」
「row 3」

Also, to help you use and debug grammars, there are two really useful modules: Grammar::Tracer and Grammar::Debugger. Just include them at the beginning of the script. Tracer shows a colorful tree of the matching done by your grammar. Debugger lets you watch the matching step by step in real time.
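
A sketch of how Grammar::Tracer is typically pulled in, assuming the module is installed (as far as I know, both modules ship in the Grammar::Debugger distribution). The use line has to come before the grammar declaration so that the grammar is picked up for tracing:

```raku
# Assumption: Grammar::Tracer is installed and on the module path.
use Grammar::Tracer;

grammar sample {
    token TOP  { [<line> \v?]* }
    token line { \N+ }
}

# Parsing now also prints a tree of the rules that were tried.
my $match = sample.parse("row 1\nrow 2\nrow 3\n");
say $match<line>.elems;
```

Swapping the first line for use Grammar::Debugger; should give the interactive, step-by-step variant instead.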

Upvotes: 11

CIAvash

Reputation: 716

my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS


grammar sample {
    token TOP {
        <line>+
    }
    token line {
        \N+ \n
    }
}

my $match = sample.parse($txt);

say $match<line>[0];

Or if you can be specific about the line:

grammar sample {
    token TOP {
        <line>+
    }
    rule line {
        \w+ \d
    }
}
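
For what it's worth, the rule variant gets away without an explicit \n because a rule turns the whitespace in its body into implicit <.ws> calls, and the one after \d also consumes the newline that ends each line. A minimal sketch, assuming the same three-row input as above:

```raku
grammar sample {
    token TOP  { <line>+ }
    # In a rule, the space between \w+ and \d and the space after \d
    # each become an implicit <.ws> call.
    rule  line { \w+ \d }
}

my $match = sample.parse("row 1\nrow 2\nrow 3\n");

# Number of lines captured.
say $match<line>.elems;
```

Note that each captured line here includes its trailing whitespace, since <.ws> is part of the rule.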

Upvotes: 2

Michael D. Hensley

Reputation: 3

my $txt = q:to/EOS/;
    row 1
    row 2
    row 3
    EOS

grammar sample {
    token TOP { <line> }
    token line { .* }
}

for $txt.lines -> $line {
    ## A single line of text....
    say $line;
    ## Parse line of text to find match obj...
    my $match = sample.parse($line);
    say $match<line>;
}

Upvotes: -3
