How to convert Earley recognizer to Earley Parser

Question

I have implemented an Earley recognizer, but I am having trouble figuring out how to recover the parse tree from the chart. Each rule added to the chart has a back pointer to the rule that "generated" it, and I am taking this quite literally. For the current rule R1:

1) If R1 has a terminal after the dot, check the terminal against the input sentence. If it matches, add a rule R2 to the next column, where R2 is the same as R1 but with the dot shifted. The back pointer of R2 is the back pointer of R1.

2) If R1 has a nonterminal after the dot, add new rules for that nonterminal to the current column. The back pointer of each new rule points to R1.

3) If there is nothing after the dot, R1 is complete. Scan the column associated with R1 (an earlier column than the current one), find all rules Rj that have the left-hand side of R1 after the dot, and add each Rj to the current column with the dot shifted. The back pointer of the shifted Rj points to R1.
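For reference, the three steps above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the code from the question: the names `Item`, `GRAMMAR`, and `parse` are hypothetical, probabilities are ignored, and the grammar is assumed to have no empty rules. It also departs from the description in one place: instead of a single back pointer, each item carries the subtrees matched so far (one per symbol left of the dot), which is one common way to make the tree recoverable from a completed item.

```python
from collections import namedtuple

# lhs -> rhs with a dot position, the column where the item started, and
# the subtrees matched so far (one per symbol to the left of the dot).
Item = namedtuple("Item", "lhs rhs dot start children")

# The grammar from the question, without probabilities (illustrative encoding).
GRAMMAR = {
    "ROOT": [("S",)],
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP"), ("Papa",)],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
    "N": [("caviar",), ("spoon",)],
    "V": [("ate",)],
    "P": [("with",)],
    "Det": [("the",), ("a",)],
}

def parse(words, grammar=GRAMMAR, root="ROOT"):
    """Return one parse tree as nested tuples, or None if there is no parse."""
    chart = [[] for _ in range(len(words) + 1)]
    seen = [set() for _ in range(len(words) + 1)]

    def add(col, item):
        # Deduplicate on (lhs, rhs, dot, start) only. Assumption: keeping the
        # first derivation is enough, so alternative parses of an ambiguous
        # sentence are dropped.
        key = item[:4]
        if key not in seen[col]:
            seen[col].add(key)
            chart[col].append(item)

    for rhs in grammar[root]:
        add(0, Item(root, rhs, 0, 0, ()))

    for i in range(len(words) + 1):
        for item in chart[i]:  # the column may grow while we iterate over it
            if item.dot < len(item.rhs):
                sym = item.rhs[item.dot]
                if sym in grammar:
                    # Step 2 (predict): nonterminal after the dot.
                    for rhs in grammar[sym]:
                        add(i, Item(sym, rhs, 0, i, ()))
                elif i < len(words) and words[i] == sym:
                    # Step 1 (scan): terminal after the dot matches the input.
                    add(i + 1, item._replace(dot=item.dot + 1,
                                             children=item.children + (sym,)))
            else:
                # Step 3 (complete): attach the finished subtree to every
                # item in the start column that was waiting for item.lhs.
                tree = (item.lhs,) + item.children
                for parent in chart[item.start]:
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        add(i, parent._replace(dot=parent.dot + 1,
                                               children=parent.children + (tree,)))

    for item in chart[-1]:
        if item.lhs == root and item.dot == len(item.rhs) and item.start == 0:
            return (item.lhs,) + item.children
    return None
```

Calling `parse("Papa ate the caviar".split())` returns a nested tuple whose first element is the node label and whose remaining elements are its children. The key difference from a single back pointer is in the complete step: the advanced item keeps everything its parent had already matched and appends the newly completed subtree, so a finished item holds its whole subtree.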

I don't think I am getting the right output, so I'm wondering if it's a problem with my logic. What needs to be done to the Earley recognizer to convert it to an Earley parser?

I have a print_parse method which recurses on the back pointers of the rules, but I don't think it produces the correct output. For the sentence

Papa ate the caviar

with (ignoring probabilities) grammar

1   ROOT    S
1   S   NP VP
0.8 NP  Det N
0.1 NP  NP PP
0.7 VP  V NP
0.3 VP  VP PP
1   PP  P NP
0.1 NP  Papa
0.5 N   caviar
0.5 N   spoon
1   V   ate
1   P   with
0.5 Det the
0.5 Det a

it generates:

(ROOT ['S'])(S ['NP', 'VP'])(NP ['Papa'])(S ['NP', 'VP'])(VP ['V', 'NP'])(V ['ate'])(VP ['V', 'NP'])(NP ['Det', 'N'])(Det ['the'])(NP ['Det', 'N'])(N ['caviar'])(ROOT ['S'])

I know the chart itself is correct, however, because I have checked it against a parse table worked out by hand. I know other questions have been asked about this, but they all point to papers that, frankly, I find difficult. I would really appreciate some help.

Answers (1)

Basically you create four types of nodes:

A rough description of the algorithm:
