Runtime Terror
Runtime Terror

Reputation: 6742

How to find FIRST and FOLLOW

I want to find FIRST and FOLLOW for the following CFG:

S -> O | M
M -> iEtMeM | a
O -> iEtS | iEtMeO
E -> b

I did the left factorization, so I get:

S -> O | M
M -> iEtMeM | a
O -> iEtO'
O'-> S | MeO
E -> b

The FIRST sets:

FIRST(S) = FIRST(O)|FIRST(M) = {i,a}
FIRST(M) = {i,a}
FIRST(O) = {i}
FIRST(O') = FIRST(S)|FIRST(M) = {i,a}
FIRST(E) = {b}

And now I cant find the FOLLOW set for S , because I don't know what FOLLOW(O') is:

FOLLOW(S) = {$, FOLLOW(O')}

Upvotes: 0

Views: 280

Answers (1)

chill
chill

Reputation: 16888

Actually FOLLOW(S) = {$} only.

So, I overlooked that S is mentioned on a right-hand side. Corrections below:

First we get the augmented grammar by adding the production S' ->S$, then FOLLOW(S') = {$}.

Then we have

  • from S' -> S$ and O' -> S

    FOLLOW(S) = FIRST($) + FOLLOW(O')

  • from M -> iEtMeM, O' -> MeO, and S -> M

    FOLLOW(M) = FIRST(eM) + FIRST(eO) + FOLLOW(S)

  • from S -> O and O' -> MeO

    FOLLOW(O) = FOLLOW(S) + FOLLOW(O')

  • from O -> iEtO'

    FOLLOW(O') = FOLLOW(O)

  • from M -> iEtMeM and O -> iEtO'

    FOLLOW(E) = FIRST(tMeM) + FIRST(tO')

The "problem" are the mutually recursive definitions for FOLLOW(S), FOLLOW(O), and `FOLLOW(O') - that means that each of these follow sets is a subset of the others, hence they are equal.

If one represent the set inclusion constraints, imposed by the above equations, as a graph (with non-terminal symbols as nodes), each set of mutually recursive definitions forms a strongly-connected component. Replacing each SCC with a new node will result in a DAG, representing a set of non-recursive equations, which can then by "evaluated" in topological order.

Say, we replace nodes, corresponding to symbols S, O and O' with node N. The equations become:

FOLLOW(N) = FIRST($) + FOLLOW(N)
FOLLOW(M) = FIRST(eM) + FIRST(eO) + FOLLOW(N)
FOLLOW(N) = FOLLOW(N) + FOLLOW(N)
FOLLOW(N) = FOLLOW(N)
FOLLOW(E) = FIRST(tMeM) + FIRST(tO')

and by cutting off the redundant parts:

FOLLOW(N) = FIRST($) = {$}
FOLLOW(M) = FIRST(eM) + FIRST(eO) + FOLLOW(N) = {e, $}
FOLLOW(E) = FIRST(tMeM) + FIRST(tO') = {t}

and, since N stands for either S, O, or O' we get:

FOLLOW(S`) = FOLLOW(S) = FOLLOW(O) = FOLLOW(O`) = {$}
FOLLOW(M) = {e, $}
FOLLOW(E) = {t}

Upvotes: 1

Related Questions