Reputation: 3773
I'm trying to capture some text between parentheses with a semicolon at the end.
Example: (in here there can be 'anything' !"#¤);); any character is possible);
I've tried this:
Text
= "(" text:(.*) ");" { return text.join(""); }
But it seems (.*) consumes the last ); as well, before ");" gets a chance to match, and I get the error:
Expected ");" or any character but end of input found
The problem is that the text can contain ");" so I want the outermost ); to decide where the line ends.
This regex \((.*)\); does what I want, but how can I do the same in PEG.js? I don't want to include the outer parentheses and semicolon in the result.
This seems like it should be quite easy if you know what you're doing =P
Upvotes: 5
Views: 2178
Reputation: 2321
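The point is that repetition in a PEG is greedy and never gives back what it has consumed, whereas a regex engine backtracks. In your rule, (.*) swallows everything up to the end of the input, including the final );, so the ");" that follows has nothing left to match. We can still simulate the semantics you want: since you say the regex \((.*)\); does what you want, we might translate it to a PEG.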
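To make the failure concrete, here is a minimal hand-written sketch of what your original rule does. It is plain JavaScript, not PEG.js output, and the function name parseNaive is made up for illustration.

```javascript
// Hand-written sketch of the rule  "(" text:(.*) ");"
// (hypothetical helper, not generated by PEG.js).
// Returns the captured text, or null when the rule fails.
function parseNaive(input) {
  let pos = 0;
  if (input[pos] !== "(") return null;
  pos += 1;

  // `.*` in a PEG: take every remaining character and never give any back.
  const start = pos;
  pos = input.length;

  // Only now is ");" tried, and there is no input left to match it.
  if (!input.startsWith(");", pos)) return null;
  return input.slice(start, pos);
}

console.log(parseNaive("(abc);")); // null
```

The rule fails on every input, which corresponds to the "end of input found" error you got.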
What does this regex do? It consumes all characters up to the end of the input, then keeps backtracking until it sees a );, i.e., it consumes the last possible );.
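As a quick check of that behavior, the regex can be run on your example line in JavaScript. The variable names are only illustrative.

```javascript
// The example line from the question, kept verbatim in a template literal.
const line = `(in here there can be 'anything' !"#¤);); any character is possible);`;

// Greedy (.*) runs to the end of the line, then backtracks to the last ");".
const match = /\((.*)\);/.exec(line);
const captured = match ? match[1] : null;

console.log(captured);
// in here there can be 'anything' !"#¤);); any character is possible
```

Note that . in a JavaScript regex does not match newlines, so this only holds for single-line input.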
To make this work with a PEG, we might use a lookahead to keep consuming iff there is still a ); ahead.
So, a solution is:
Text
= "(" text:TextUntilTerminator ");" { return text.join(""); }
TextUntilTerminator
= x:(&HaveTerminatorAhead .)* { return x.map(y => y[1]) }
HaveTerminatorAhead
= . (!");" .)* ");"
The TextUntilTerminator non-terminal checks whether HaveTerminatorAhead matches, without consuming anything (a lookahead, the & symbol), and then consumes one single character. It repeats this until it knows we've reached the final ); in the input.
The HaveTerminatorAhead non-terminal is simple: it checks that there is one character ahead and, if so, guarantees that at least one ); follows it. The negative lookahead ! makes the inner loop stop at the first ); it sees (rather than consuming it, which would reproduce your original problem).
This PEG, then, reproduces the behavior of the regex you suggested.
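If it helps to see the control flow, the three rules can be mirrored by hand in plain JavaScript. This is only a sketch of the same logic, not what PEG.js generates; the function names haveTerminatorAhead and parseText are made up for illustration, and unlike a PEG.js parser the sketch does not insist that the whole input is consumed.

```javascript
// HaveTerminatorAhead = . (!");" .)* ");"
// True when, after skipping the character at `pos`, a ");" still follows.
function haveTerminatorAhead(input, pos) {
  if (pos >= input.length) return false; // `.` needs one character
  let p = pos + 1;
  // (!");" .)* : advance until the next ");" or the end of the input
  while (p < input.length && !input.startsWith(");", p)) p += 1;
  return input.startsWith(");", p); // ");"
}

// Text = "(" text:TextUntilTerminator ");"
// Returns the captured text, or null when the rule fails.
function parseText(input) {
  if (input[0] !== "(") return null;
  let pos = 1;
  let text = "";
  // TextUntilTerminator = (&HaveTerminatorAhead .)*
  while (haveTerminatorAhead(input, pos)) {
    text += input[pos];
    pos += 1;
  }
  // Here `pos` sits on the last ");" (if there is one).
  return input.startsWith(");", pos) ? text : null;
}

console.log(parseText("(a);b);")); // a);b
```

The lookahead rescans the rest of the input at every character, so this sketch is quadratic in the worst case. That is usually fine for short lines like the one in the question.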
Upvotes: 14