Parslet grammar for rules that start identically

Question

I want to write a parser for so-called Subversion authorization config files (see path-based authorization in the Subversion red book). In these files I want to define rules for directories like

[/]
* = r
[/trunk]
@PROJECT = rw

The part of the grammar I have problems with is the path definition. I currently have the following rules in Parslet:

rule(:auth_rule_head) { (str('[') >> path >> str(']') >> newline).as(:arh) }
rule(:top)          { (str('/')).as(:top) }
rule(:path)         { (top | ((str('/') >> path_ele).repeat)).as(:path) }
rule(:path_ele)     { ((str('/').absent? >> any).repeat).as(:path_ele) }

So I want to distinguish two cases:

Only [/] (the root directory)
In all other cases, a / followed by a path element, which may be repeated, but the path must not end with a /

The problematic rule seems to be path, which defines an alternative: either / alone or something like /trunk.

I have defined test cases for those, and get the following error when running the test case:

Failed to match sequence (SPACES '[' PATH ']' NEWLINE) at line 1 char 3.
`- Expected "]", but got "t" at line 1 char 3.

So the problem seems to be that in the alternative (rule :path), top is chosen every time.

What is a solution (as a grammar) for this problem? I think there should be one, since this looks like a common situation that comes up from time to time. I am not an expert in PEG parsers or parser / compiler generation, so if this is a fundamental problem that cannot be solved, I would like to know that as well.

mliebelt · Accepted Answer

It seems I had not understood the problem correctly. I tried to reproduce it by creating a small example grammar with some unit tests, but there it works.

If you are interested, have a look at the gist https://gist.github.com/mliebelt/a36ace0641e61f49d78f. You should be able to download the file and run it directly from the command line. You have to install parslet first; minitest should already be included in a current Ruby version.

There I only added the (missing) rule for newline, plus 3 unit tests that cover all cases:

The root: /
A path with only one element: /my
A path with more than one element: /my/path

It works as expected, so I get two cases here:

Top element only
One or more path elements

Perhaps this helps others see how to debug a situation like this.
