Trident D'Gao
Trident D'Gao

Reputation: 19762

what is wrong with this peg grammar?

the following grammar (from RFC 2396):

domainlabel = 'a' / ('a' ('a' / '-')* 'a')

cannot parse this:

aa

why?

Upvotes: 0

Views: 129

Answers (1)

rici
rici

Reputation: 241911

Because PEG is not BNF. The use of / rather than the usual BNF alternation operator | (as seen in RFC 2396) was a deliberate attempt to avoid confusion (although it doesn't help that older standards such as RFC 822 also used /).

In a PEG, / is the "ordered-choice" operator. Unlike the BNF alternation operator, / is not symmetrical. If the first alternative succeeds, it is accepted. Only if the first alternative fails does the PEG backtrack and try the second alternative.

So when 'a' / ('a' ('a' / '-')* 'a') is applied to aa, the first alternative successfully absorbs the first a and the second alternative is never attempted. The fact that the parse subsequently fails to match the entire string does not matter; / only backtracks if matching the first alternative itself fails, not if some subsequent part of the parse fails.

In short, if you use PEGs, you need to be careful to write your alternations in the correct order.

Upvotes: 1

Related Questions