Henrique Inonhe
Henrique Inonhe

Reputation: 175

Is it possible to deal with left recursion in LL by checking the length of the sentential form and discarding infinitely ambiguous grammars?

I understand that an LL parser (no lookahead) cannot deal with left recursive rules due to the fact it will keep on predicting the left recursive non terminal over and over again and would never be able to match.

But what if we adopted the combination of 2 strategies:

  1. We build a table where we assign each non terminal the minimum length of the strings its sublanguage generates.

So, suppose we have a left recursive rule as follows:

A -> As | a

Where s is a string composed of terminals and non terminals.

If either s generates strings with minimum length greater than 0, then even though when faced with the non terminal "A" in a prediction we would be predicting "A s" over and over again, the minimum length of the predicted string would only grow, eventually surpassing the length of the input string and therefore being discarded.

Of course this wouldn't work with grammars that are infinitely ambiguous (like the ones that have loops), leading us to the second strategy:

  1. Each time we make a prediction either the minimum length of the whole sentential form grows or it stays the same, in the case where it stays the same we keep a stack of the leftmost predicted non terminals, and whenever we predict the same non terminal twice, we discard the prediction.

This effectively discards infinitely many derivations, though it preserves the language accepted by the parser.


As a concrete example, suppose the following grammar:

S -> Sa | a

The minimum length of the strings generated by the non terminal S is 1.

Now suppose we're parsing the following input string: aa.

  1. We start by predicting the start symbol, "S".
  2. Using breadth-first search, we then have 2 predictions: "Sa" and "a".
  3. The "a" prediction is discarded since even though it matches the first "a" in the input string it reaches EOL while there is still an "a" left in the input string.
  4. The "Sa" prediction cannot match anything since we have "S" as the "matching token" in the sentential form, thus we make other 2 predictions: "Saa" and "aa".
  5. The "aa" prediction matches the entire input string and thus we have found a parsing.
  6. The "Saa" prediction has minimum length 3 and therefore cannot match the input string and is discarded.

Would this work? Am I missing something?

Upvotes: 2

Views: 450

Answers (0)

Related Questions