Reputation: 15525
I want to provide a parser for parsing so called Subversion config auth files (see patch based authorization in the Subversion red book). Here I want to define rules for directories like
[/]
* = r
[/trunk]
@PROJECT = rw
So the part of the grammar I have problems is the path definition. I currently have the following rules in Parslet:
rule(:auth_rule_head) { (str('[') >> path >> str(']') >> newline).as(:arh) }
rule(:top) { (str('/')).as(:top) }
rule(:path) { (top | ((str('/') >> path_ele).repeat)).as(:path) }
rule(:path_ele) { ((str('/').absent? >> any).repeat).as(:path_ele) }
So I want to divide in two cases:
[/]
(the root directory)[/<dir>]
which may be repeated, but has to end without a /
The problematic rule seems to be the path
that defines an alternative, here /
XOR something like /trunk
I have defined test cases for those, and get the following error when running the test case:
Failed to match sequence (SPACES '[' PATH ']' NEWLINE) at line 1 char 3.
`- Expected "]", but got "t" at line 1 char 3.
So the problem seems to be, that the alternative (rule :path) is chosen all the time top
.
What is a solution (as a grammar) for this problem? I think there should be a solution, and this looks like something idiomatic that should happen from here to there. I am not an expert at all with PEG parsers or parser / compiler generation, so if that is a base problem not solvable, I would like to know that as well.
Upvotes: 1
Views: 77
Reputation: 15525
Seems to be I have not got the problem right. I have tried to reproduce the problem in creating a small example grammar including some unit tests, but now, the thing is working.
If you are interested in it, have a look at the gist https://gist.github.com/mliebelt/a36ace0641e61f49d78f. You should be able to download the file, and run it directly from the command line. You have to have installed first parslet
, minitest
should be already included in a current Ruby version.
I have added there only the (missing) rule for newline
, and added 3 unit tests to test all cases:
/
/my
/my/path
Works like expected, so I get two cases here:
Perhaps this may help others how to debug a situation like that.
Upvotes: 0
Reputation: 21548
In short: Swap the OR conditions around.
Parlset rules consume the input stream until they get a match, then they stop. If you have two possible options (an OR), the first is tried, and only if it doesn't match is the second tried.
In your case, as all your paths start with '/' they all match the first part of the path rule, so the second half is never explored.
You need to try to match the full path first, and only match the 'top' if it fails.
# changing this
rule(:path) { (top | ((str('/') >> path_ele).repeat)).as(:path) }
# to this
rule(:path) { ((str('/') >> path_ele).repeat) | top).as(:path) }
# fixes your first problem :)
Also... Be careful of rules that can consume nothing being in a loop. Repeat by default is repeat(0). Usually it needs to be repeat (1).
rule(:path) { ((str('/') >> path_ele).repeat(1)) | top).as(:path) }
also...
Is "top" really a special case? All paths end in a "/", so top is just the zero length path.
rule(:path) { (path_ele.repeat(0) >> str('/')).as(:path) }
Or
rule(:path) { (str('/') >> path_ele.repeat(0)).as(:path) }
rule(:path_ele) { ((str('/').absent? >> any).repeat(0)).as(:path_ele) >> str('/') }
# assuming "//" is valid otherwise repeat(1)
Upvotes: 1