I'm trying to create a grammar. Here's my code so far:
use Text::Table::Simple; # zef install Text::Table::Simple
my $desc = q:to"FIN";
record person
name string;
age int;
end-record
FIN
grammar rec {
    token TOP { <ws>* 'record' \s+ <rec-name> <field-descriptors> <ws> 'end-record' <ws> }
    token rec-name { \S+ }
    token field-descriptors { <field-descriptor>* }
    token field-descriptor { <ws>* <field-name> <ws>+ <field-type> <ws>* ';' }
    token field-name { \S+ }
    token field-type { <[a..z]>+ }
    token ws { <[\r\n\t\ ]> }
}
class recActions {
    method field-descriptors($/) { $/.make: $/; }
    method field-descriptor($/) { $/.make: $/; }
    method field-name($/) { $/.make: $/ }
    method field-type($/) { $/.make: $/ }
}
my $r = rec.parse($desc, :actions(recActions));
#say $r;
my $inp = q:to"FIN";
adam 26
joe 23
mark 51
FIN
sub splitter($line) {
    my @lst = split /\s+/, $line;
}
sub matrixify(&splitter, $data)
{
    my @d = (split /\n/, (trim-trailing $data)).map( -> $x { splitter $x ; } );
    #@d.say;
    #my @cols = <name age>;
    #say lol2table(@cols, @d).join("\n");
    @d;
}
#my @cols =<A B>;
#my @rows = ([1,2], [3,4]);
#say lol2table(@cols, @rows).join("\n");
my @m = matrixify &splitter, $inp;
sub tabulate($rec-desc, @matrix)
{
    my $fds = $rec-desc<field-descriptors>;
    #say %fds<field-name>;
    say $fds;
    my @cols = $rec-desc.<field-descriptors>.map( -> $fd { say $fd; $fd.<field-name> ; 1;} );
    #say $rec-desc.<field-descriptors>;
    #say @cols;
}
tabulate $r, @m ;
I really just want the grammar to create a tree of lists/hash tables from the input. The output from the code is:
「
name string;
age int;」
field-descriptor => 「
name string;」
ws => 「
」
ws => 「 」
field-name => 「name」
ws => 「 」
field-type => 「string」
field-descriptor => 「
age int;」
ws => 「
」
ws => 「 」
field-name => 「age」
ws => 「 」
ws => 「 」
field-type => 「int」
which looks fairly good. perl6 seems to be decoding the fact that field-descriptors is composed of multiple field-descriptor, but it doesn't actually seem to put them into a list. I can do say $fds;, but I can't do say $fds[0];. Why does the former "work", but the latter doesn't?
I must admit to having a fairly weak grasp of what's going on. Would I be better off using rules instead of tokens? Do I really need an actions class; can't I just get perl to "automagically" populate the parse tree for me without having to specify a class of actions?
Update: possible solution
Suppose we just want to parse:
my $desc = q:to"FIN";
record person
name string;
age int;
end-record
FIN
and report on the field names and types that we find. I'm going to make a slight simplification to the grammar I wrote above:
grammar rec {
    token TOP { <ws>* 'record' \s+ <rec-name> <field-descriptor>+ <ws> 'end-record' <ws> }
    token rec-name { \S+ }
    token field-descriptor { <ws>* <field-name> <ws>+ <field-type> <ws>* ';' }
    token field-name { \S+ }
    token field-type { <[a..z]>+ }
    token ws { <[\r\n\t\ ]> }
}
Let's eschew actions completely, and just parse it into a tree:
my $r1 = rec.parse($desc);
Let's now inspect our handiwork, and print out the name and type for each field that we have parsed:
for $r1<field-descriptor> -> $fd { say "Name: $fd<field-name>, Type: $fd<field-type>"; }
Our output is as we expect:
Name: name, Type: string
Name: age, Type: int
Upvotes: 2
Views: 371
I know you're now all set but here's an answer to wrap things up for others reading things later.
How do I just create a parse tree with perl6 grammar?
It's as simple as it can get: just use the return value from calling one of the built-in parsing routines.
(Provided parsing is successful, parse and its cousins return a parse tree.)
The output from the code ... looks fairly good. perl6 seems to be decoding the fact that field-descriptors is composed of multiple field-descriptor, but it doesn't actually seem to put them into a list. I can do say $fds;, but I can't do say $fds[0];. Why does the former "work", but the latter doesn't?
See my answer to the SO question "How do I access the captures within a match?".
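In short, say $fds prints the whole match, while $fds[0] asks for a positional capture, and this grammar only produces named captures. The list lives under the name field-descriptor. A minimal sketch of that, assuming a simplified version of the question's grammar (the name Rec, the use of \s instead of a custom ws token, and the input string are illustrative choices, not the asker's exact code):

```raku
# Simplified variant of the question's grammar; names are illustrative.
grammar Rec {
    token TOP               { \s* 'record' \s+ <rec-name> <field-descriptors> \s* 'end-record' \s* }
    token rec-name          { \S+ }
    token field-descriptors { <field-descriptor>* }
    token field-descriptor  { \s* <field-name> \s+ <field-type> \s* ';' }
    token field-name        { \S+ }
    token field-type        { <[a..z]>+ }
}

my $desc = "record person\n  name string;\n  age int;\nend-record\n";
my $m    = Rec.parse($desc);
my $fds  = $m<field-descriptors>;

# $fds is a single Match object. It has no positional captures,
# so indexing it with [0] does not reach the individual fields.
say $fds[0].defined;                        # False

# The quantified subrule <field-descriptor>* is stored as a list
# under its name, so index into that named capture instead.
say $fds<field-descriptor>.elems;           # 2
say ~$fds<field-descriptor>[0]<field-name>; # name
say ~$fds<field-descriptor>[1]<field-type>; # int
```

The same pattern applies at any depth: a subrule that is quantified with * or + yields a list under the subrule's name.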
Would I be better off using rules instead of tokens?
The only difference between a token and a rule is the default for interpreting bare whitespace that you include within the token/rule.
(Bare whitespace within a token is completely ignored. Bare whitespace within a rule denotes "there can be whitespace at this point in the input".)
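To make the difference concrete, here is a small sketch with two otherwise identical grammars (WithToken and WithRule are hypothetical names used only for this illustration). In the rule, the space between the two literals is significant; in the token it is ignored:

```raku
# Identical patterns; only the declarator differs.
grammar WithToken { token TOP { 'name' 'string' } }
grammar WithRule  { rule  TOP { 'name' 'string' } }

# In a token the space between the two literals is ignored,
# so the literals must be adjacent in the input.
say so WithToken.parse('namestring');   # True
say so WithToken.parse('name string');  # False

# In a rule the same space means "whitespace may appear here".
say so WithRule.parse('name string');   # True
```

Note that a rule matches that whitespace by calling the grammar's ws token, so a grammar that overrides ws, as the one in the question does, changes what a rule accepts at those points.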
Do I really need an actions class[?]
No.
Only bother with an actions class if you want to systematically post-process the parse tree.
can't I just get perl to "automagically" populate the parse tree for me without having to specify a class of actions?
Yes. Any time you call parse and the parse is successful, its return value is a parse tree.
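Because the tree is only there when the parse succeeds, it is worth checking the return value before indexing into it. A minimal sketch, using a hypothetical one-token grammar named Digits purely for illustration:

```raku
# Hypothetical grammar used only to show success versus failure.
grammar Digits { token TOP { \d+ } }

my $ok  = Digits.parse('2024');   # input matches in full
my $bad = Digits.parse('20x4');   # input does not match in full

say $ok.^name;       # Match
say $bad.defined;    # False

# A failed parse is false in boolean context, so a plain
# conditional is enough to guard access to the tree.
if $ok  { say "parsed: $ok" }
if !$bad { say "no parse tree for the second input" }
```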
Update: possible solution
Let's eschew actions completely, and just parse it into a tree:
Right. If all you want is the parse tree then you don't need an actions class and you don't need to call make or made.
Conversely, if you want another tree, such as an Abstract Syntax Tree, then you will probably find it convenient to use the built-in make and made routines. And if you use make and made you may well find it appropriate to use them in conjunction with a separate actions class rather than just embedding them directly in the grammar's rules/tokens/regexes.
Upvotes: 2