blippy

Reputation: 1540

How do I just create a parse tree with perl6 grammar?

I'm trying to create a grammar. Here's my code so far:

use Text::Table::Simple; # zef install Text::Table::Simple

my $desc = q:to"FIN";
record person
        name string;
        age  int;
end-record
FIN

grammar rec {
        token TOP { <ws>* 'record' \s+ <rec-name> <field-descriptors> <ws> 'end-record' <ws> }  
        token rec-name { \S+ }
        token field-descriptors { <field-descriptor>* }
        token field-descriptor { <ws>* <field-name> <ws>+ <field-type> <ws>* ';' }
        token field-name { \S+ }
        token field-type { <[a..z]>+ }
        token ws { <[\r\n\t\ ]> }
}


class recActions {
        method field-descriptors($/) { $/.make: $/; }
        method field-descriptor($/) { $/.make: $/; }
        method field-name($/) { $/.make: $/ }
        method field-type($/) { $/.make: $/ }
}

my $r = rec.parse($desc, :actions(recActions));
#say $r;

my $inp = q:to"FIN";
adam    26
joe     23
mark    51
FIN

sub splitter($line) { 
        my @lst = split /\s+/, $line; 
}


sub matrixify(&splitter, $data)
{
        my @d = (split /\n/, (trim-trailing $data)).map( -> $x { splitter $x ; } );
        #@d.say;
        #my @cols = <name age>;
        #say lol2table(@cols, @d).join("\n");
        @d;
}

#my @cols =<A B>;
#my @rows = ([1,2], [3,4]);
#say lol2table(@cols, @rows).join("\n");

my @m = matrixify &splitter, $inp;

sub tabulate($rec-desc, @matrix)
{
        my $fds = $rec-desc<field-descriptors>;
        #say %fds<field-name>;
        say $fds;
        my @cols = $rec-desc.<field-descriptors>.map( -> $fd { say $fd; $fd.<field-name> ; 1;} );
        #say $rec-desc.<field-descriptors>;
        #say @cols;
}
tabulate $r, @m ;

I really just want the grammar to create a tree of lists/hash tables from the input. The output from the code is:

「
    name string;
    age  int;」
 field-descriptor => 「
    name string;」
  ws => 「
」
  ws => 「   」
  field-name => 「name」
  ws => 「 」
  field-type => 「string」
 field-descriptor => 「
    age  int;」
  ws => 「
」
  ws => 「   」
  field-name => 「age」
  ws => 「 」
  ws => 「 」
  field-type => 「int」

which looks fairly good. perl6 seems to be decoding the fact that field-descriptors is composed of multiple field-descriptor, but it doesn't actually seem to put them into a list. I can do say $fds;, but I can't do say $fds[0];. Why does the former "work", but the latter doesn't?

I must admit to having a fairly weak grasp of what's going on. Would I be better off using rules instead of tokens? Do I really need an actions class? Can't I just get perl6 to "automagically" populate the parse tree for me without having to specify a class of actions?

Update: possible solution

Suppose we just want to parse:

my $desc = q:to"FIN";
record person
    name string;
    age  int;
end-record
FIN

and report on the field names and types that we find. I'm going to make a slight simplification to the grammar I wrote above:

grammar rec {
    token TOP { <ws>* 'record' \s+ <rec-name> <field-descriptor>+ <ws> 'end-record' <ws> }  
    token rec-name { \S+ }
    token field-descriptor { <ws>* <field-name> <ws>+ <field-type> <ws>* ';' }
    token field-name { \S+ }
    token field-type { <[a..z]>+ }
    token ws { <[\r\n\t\ ]> }
}

Let's eschew actions completely, and just parse it into a tree:

my $r1 = rec.parse($desc);

Let's now inspect our handiwork, and print out the name and type for each field that we have parsed:

for $r1<field-descriptor> -> $fd { say "Name: $fd<field-name>, Type: $fd<field-type>"; }

Our output is as we expect:

Name: name, Type: string
Name: age, Type: int

Upvotes: 2

Views: 371

Answers (1)

raiph

Reputation: 32414

I know you're now all set but here's an answer to wrap things up for others reading things later.

How do I just create a parse tree with perl6 grammar?

It's as simple as it can get: just use the return value of one of the built-in parsing routines (parse, subparse, parsefile).

(Provided parsing is successful, parse and its cousins return a parse tree.)
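
As a minimal sketch of that in Raku (formerly Perl 6): the Setting grammar and its token names below are made up for illustration and are not from the question. The value returned by parse is already the tree, and named subrules become keys in it.

```raku
# Hypothetical grammar, for illustration only.
grammar Setting {
    token TOP    { <label> '=' <amount> }
    token label  { \w+ }
    token amount { \d+ }
}

# The return value of .parse is the parse tree (a Match object).
my $tree = Setting.parse('answer=42');
say $tree<label>.Str;     # answer
say $tree<amount>.Str;    # 42

# An unsuccessful parse does not give you a tree.
my $failed = Setting.parse('no equals sign');
say $failed.defined;      # False
```

No actions class is involved at any point.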

The output from the code ... looks fairly good. perl6 seems to be decoding the fact that field-descriptors is composed of multiple field-descriptor, but it doesn't actually seem to put them into a list. I can do say $fds;, but I can't do say $fds[0];. Why does the former "work", but the latter doesn't?

See my answer to the SO question "How do I access the captures within a match?".
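
In short: `$fds[0]` asks for a positional capture, but a subrule call such as `<field-descriptor>` is a named capture, so it lives under `$fds<field-descriptor>`. Because it is quantified, that entry is a list. A sketch, assuming a cut-down version of the question's grammar (the Fields name and the simplified whitespace handling are mine):

```raku
# Cut-down, hypothetical variant of the question's grammar.
grammar Fields {
    token TOP               { <field-descriptors> }
    token field-descriptors { <field-descriptor>* }
    token field-descriptor  { \s* <field-name> \s+ <field-type> \s* ';' }
    token field-name        { \w+ }
    token field-type        { <[a..z]>+ }
}

my $m   = Fields.parse('name string; age int;');
my $fds = $m<field-descriptors>;

say $fds.list.elems;                             # 0 (no positional captures)
say $fds<field-descriptor>.elems;                # 2 (quantified named capture)
say $fds<field-descriptor>[0]<field-name>.Str;   # name
say $fds<field-descriptor>[1]<field-type>.Str;   # int
```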

Would I be better off using rules instead of tokens?

The only difference between a token and a rule is how bare whitespace inside the pattern is interpreted by default: a rule is a token with :sigspace turned on.

(Bare whitespace within a token is completely ignored. Bare whitespace within a rule becomes an implicit <.ws> call, meaning "there can be whitespace at this point in the input".)
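
A small sketch of the difference, using two hypothetical grammars that differ only in the declarator of TOP and rely on the default ws:

```raku
# Same pattern text, different declarator (names are illustrative).
grammar WithToken {
    token TOP      { 'record' <rec-name> }
    token rec-name { \w+ }
}

grammar WithRule {
    rule  TOP      { 'record' <rec-name> }
    token rec-name { \w+ }
}

# token: the space in the pattern is ignored, so the input may not contain one.
say WithToken.parse('record person').defined;   # False
say WithToken.parse('recordperson').defined;    # True

# rule: the space in the pattern calls <.ws>, so the input may contain whitespace.
say WithRule.parse('record person').defined;    # True
say WithRule.parse('recordperson').defined;     # False
```

Note that a rule calls whatever ws the grammar defines. The grammar in the question overrides ws to match exactly one whitespace character, so switching it to rules would also mean revisiting that override.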

Do I really need an actions class[?]

No.

Only bother with an actions class if you want to systematically post-process the parse tree.

can't I just get perl to "automagically" populate the parse tree for me without having to specify a class of actions?

Yes. Any time you call parse and the parse is successful, its return value is a parse tree.

Update: possible solution

Let's eschew actions completely, and just parse it into a tree:

Right. If all you want is the parse tree then you don't need an actions class and you don't need to call make or made.

Conversely, if you want a different tree, such as an Abstract Syntax Tree, then you will probably find it convenient to use the built-in make and made routines. And if you use make and made, you may well find it appropriate to put them in a separate actions class rather than embedding them directly in the grammar's rules/tokens/regexes.
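
For completeness, here is a rough sketch of that second approach. It is one possible shape, not the only one: the Rec grammar is a rule-based rewrite of the question's grammar using the default ws, and the RecActions class and the hash layout it builds are my own assumptions.

```raku
# Rule-based rewrite of the question's grammar (illustrative).
grammar Rec {
    rule  TOP              { 'record' <rec-name> <field-descriptor>+ 'end-record' }
    token rec-name         { \w+ }
    rule  field-descriptor { <field-name> <field-type> ';' }
    token field-name       { \w+ }
    token field-type       { <[a..z]>+ }
}

# Each method runs when the rule of the same name matches;
# make attaches a value that made later retrieves.
class RecActions {
    method field-descriptor($/) {
        make %( name => $<field-name>.Str, type => $<field-type>.Str );
    }
    method TOP($/) {
        make %( name => $<rec-name>.Str, fields => [ $<field-descriptor>».made ] );
    }
}

my $desc = q:to/END/;
    record person
        name string;
        age  int;
    end-record
    END

my $ast = Rec.parse($desc, actions => RecActions).made;
say "record: $ast<name>";
for $ast<fields>.list -> $f { say "$f<name>: $f<type>" }
```

The result in $ast is a plain hash holding a list of hashes, with no Match objects left in it.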

Upvotes: 2
