JCLL
JCLL

Reputation: 5555

LL(*) versus PEG parsers : what is the difference?

I am wondering if ANTLR v3, which presents its internal parsing algorithm as "LL(*)" is fully representative of PEG (parsing expression grammar) parsers.

Is there a difference ?

Upvotes: 15

Views: 7735

Answers (4)

luikore
luikore

Reputation: 770

The ANTLR article was wrong about PEG.

LL(*) is a subset of DCFG (Deterministic Context Free Grammar), which is a subset of CFG (Context Free Grammar).

PEG can express context sensitive grammars like A{n}B{n}C{n}, in which A and B and C all occur n times. Here's the definition:

s := &(x C) A+ y / ε
x := A x B / A B
y := B y C / B C

But there is no way to define such grammar in CFG (the proof involves pump lemma). So PEG is not subset CFG. Whether PEG is superset of CFG? I don't know.

Two key differences between LL(*) and PEG:

  1. LL(*) can only lookahead a DFA pattern, while PEG can lookahead a recursive pattern. For example, in PEG you can lookahead nested parens while LL(*) can't.

  2. The choice operator / in PEG is priority choice (or "possessive"), which means if you have rule A / AB, it never reaches the right hand side AB. In LL(*) for a rule A | AB, it is possible to match AB.

If you have a PEG grammar without a lookahead, or your lookahead pattern can be reduced into a DFA, then it can be translated into LL(*). If not, it is not possible.

Upvotes: 10

vldmrrdjcc
vldmrrdjcc

Reputation: 2132

ANTLR and PEG are not the same. It is pretty theoretical question, and I think it will be the best for you to refer to this paper wrote by Terrence Parr where he exactly points the differences between ANTLR and PEG and some advantages of ANTLR LL(*) parsing strategy. I don't want to give myself freedom to paraphrase what he wrote there, but it is better for you to read the whole paper.

Upvotes: 3

Bart Kiers
Bart Kiers

Reputation: 170278

In ANTLR you can enable global backtracking on all production rules in your grammar, so that for k >= 1 you could parse pretty much the same as PEG's. Of course, because of all the potential backtracking, the run-time of the parser degrades. At the cost of (some) memory you could also enable memoization, making it behave like a Packrat-parser, able to parse the input in linear time.

So no, there's not much difference w.r.t ANTLR and PEG/Packrat (with the right options enabled!).

Upvotes: 4

H-Man2
H-Man2

Reputation: 3189

According to the tools listed here ANTLR is a full representative of a PEG parser:

ANTLR, a well-established parser generator by Terence Parr, supports extensive PEG features and combines packrat parsing with LL parsing techniques.

Upvotes: 1

Related Questions