Shahawn

Reputation: 31

Creating a DCG capable of parsing trees in Prolog

I have to create a DCG in Prolog with the following features:

  1. handle subject/object distinction
  2. handle singular/plural distinction
  3. capable of producing parse trees
  4. make use of a separate lexicon

Here's the given lexicon:

lex(the,det,_).
lex(a,det,singular).
lex(man,n,singular).
lex(men,n,plural).
lex(woman,n,singular).
lex(women,n,plural).
lex(apple,n,singular).
lex(apples,n,plural).
lex(pear,n,singular).
lex(pears,n,plural).

lex(eat,v,plural).
lex(eats,v,singular).
lex(know,v,plural).
lex(knows,v,singular).

lex(i,pronoun,singular,subject).
lex(we,pronoun,plural,subject).
lex(me,pronoun,singular,object).
lex(us,pronoun,plural,object).
lex(you,pronoun,_,_).
lex(he,pronoun,singular,subject).
lex(she,pronoun,singular,subject).
lex(him,pronoun,singular,object).
lex(her,pronoun,singular,object).
lex(they,pronoun,plural,subject).
lex(them,pronoun,plural,object).
lex(it,pronoun,singular,_).

And here's my code:

s(s(NP,VP)) --> np(NP,X,subject), vp(VP,X).

np(np(DET,N),X,_) --> det(DET,X), n(N,X).
np(np(PRO),X,Y) --> pro(PRO,X,Y).

vp(vp(V,NP),X) --> v(V,X), np(NP,_,object).
vp(vp(V),X) --> v(V,X).

det(det(DET),X) --> [DET], {lex(DET,det,X)}.

n(n(N),X) --> [N], {lex(N,n,X)}.

pro(pro(PRO),X,Y) --> [PRO], {lex(PRO,pro,X,Y)}.

v(v(V),X) --> [V], {lex(V,v,X)}.

When I input:

s(X, [the, man, eats, the, apple], []).

I should get:

X = s(np(det(the, singular), n(man, singular, subject)), vp(v(eats, singular), np(det(the, singular), n(apple, singular, object))))

But instead I get:

X = s(np(det(the), n(man)), vp(v(eats), np(det(the), n(apple)))) 

I'm not sure why the extra information (number and subject/object role) is missing from the output.

Upvotes: 2

Views: 1108

Answers (1)

Isabelle Newbie

Reputation: 9378

Calling DCGs like

rule(Args, List, Rest)

is a somewhat "old style" approach that many sources still teach; the more "modern" way is to use phrase/2 or phrase/3 instead:

?- phrase(s(Tree), [the, man, eats, the, apple]).
Tree = s(np(det(the), n(man)), vp(v(eats), np(det(the), n(apple)))) ;
false.

?- phrase(s(Tree), [the, man, eats, the, apple], Rest).
Tree = s(np(det(the), n(man)), vp(v(eats), np(det(the), n(apple)))),
Rest = [] ;
Tree = s(np(det(the), n(man)), vp(v(eats))),
Rest = [the, apple] ;
false.

The two-argument form saves you from specifying the rest as [] in the common case where you want a full parse. You also separate the arguments of the DCG rule from the list to be parsed. So in the phrase call, s takes a single argument for the syntax tree, just like in its definition.
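
To see why both call styles work, it can help to look at roughly what a DCG rule turns into. The sketch below is a hand-written approximation, not the exact expansion produced by any particular Prolog system, and the names n_dcg and n_plain are made up for this comparison:

```prolog
% Tiny lexicon excerpt so the example is self-contained.
lex(man, n, singular).
lex(men, n, plural).

% The noun rule from the question, renamed to n_dcg for this comparison.
n_dcg(n(N), X) --> [N], { lex(N, n, X) }.

% Approximate plain-Prolog equivalent: two extra arguments carry the
% input list and the part of it that is left unparsed.
n_plain(n(N), X, [N|Rest], Rest) :- lex(N, n, X).

% Both queries bind Tree = n(man) and Number = singular:
% ?- phrase(n_dcg(Tree, Number), [man]).
% ?- n_plain(Tree, Number, [man], []).
```

The two list arguments are an implementation detail of the translation, which is why going through phrase keeps calls independent of it.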

As for your problem, the good thing is that Prolog is a very testable language. If something big -- like parsing a whole sentence -- goes wrong, we can break the problem down and test smaller bits -- like parsing a noun phrase, or just a noun.

So, breaking the subject noun phrase ("the man") down into smaller and smaller parts:

?- phrase(np(Tree, Number, Role), [the, man]).
Tree = np(det(the), n(man)),
Number = singular ;
false.

?- phrase(det(Tree, Number), [the]).
Tree = det(the).

?- phrase(n(Tree, Number), [man]).
Tree = n(man),
Number = singular.

You would like to parse the noun phrase to np(det(the, singular), n(man, singular, subject)), but the actual trees you get from det and n are already missing some of the extra arguments. You need to adjust these:

det(det(DET, Number), Number) --> [DET], {lex(DET, det, Number)}.

n(n(N, Number, _Role), Number) --> [N], {lex(N, n, Number)}.

With this you get:

?- phrase(det(Tree, Number), [the]).
Tree = det(the, Number).

?- phrase(n(Tree, Number), [man]).
Tree = n(man, singular, _2114),
Number = singular.

?- phrase(np(Tree, Number, Role), [the, man]).
Tree = np(det(the, singular), n(man, singular, _2178)),
Number = singular ;
false.

The parse for the whole sentence is now:

?- phrase(s(Tree), [the, man, eats, the, apple]).
Tree = s(np(det(the, singular), n(man, singular, _2210)), vp(v(eats), np(det(the, singular), n(apple, singular, _2240)))) ;
false.

The extra arguments on the determiners and nouns are there now. What's still missing is the same treatment for verbs, and passing the role down through the noun phrase rules so that the "role" (subject or object) is bound correctly.
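
One way to finish this off might look like the following sketch. It uses only an excerpt of the lexicon from the question. The shape of the pro(...) tree is an assumption, since the question does not show the expected output for pronouns. Note also that the question's pro rule looks up the category pro, while the lexicon uses pronoun; the sketch uses pronoun.

```prolog
% Lexicon excerpt from the question.
lex(the, det, _).
lex(a, det, singular).
lex(man, n, singular).
lex(men, n, plural).
lex(apple, n, singular).
lex(apples, n, plural).
lex(eat, v, plural).
lex(eats, v, singular).
lex(he, pronoun, singular, subject).
lex(him, pronoun, singular, object).
lex(they, pronoun, plural, subject).
lex(them, pronoun, plural, object).

% The subject and the verb phrase must agree in number.
s(s(NP, VP)) --> np(NP, Number, subject), vp(VP, Number).

% The role is passed down so that it ends up in the noun's tree.
np(np(Det, N), Number, Role) --> det(Det, Number), n(N, Number, Role).
np(np(Pro), Number, Role) --> pro(Pro, Number, Role).

% The object may have any number, but must be usable as an object.
vp(vp(V, NP), Number) --> v(V, Number), np(NP, _, object).
vp(vp(V), Number) --> v(V, Number).

det(det(Det, Number), Number) --> [Det], { lex(Det, det, Number) }.
n(n(N, Number, Role), Number, Role) --> [N], { lex(N, n, Number) }.
% Assumed tree shape for pronouns: pro(Word, Number, Role).
pro(pro(Pro, Number, Role), Number, Role) -->
    [Pro], { lex(Pro, pronoun, Number, Role) }.
v(v(V, Number), Number) --> [V], { lex(V, v, Number) }.

% ?- phrase(s(Tree), [the, man, eats, the, apple]).
% Tree = s(np(det(the, singular), n(man, singular, subject)),
%          vp(v(eats, singular),
%             np(det(the, singular), n(apple, singular, object)))).
```

With these rules, number disagreement such as [the, men, eats, the, apple] and a wrongly cased pronoun such as [him, eats, the, apple] both fail to parse.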

Upvotes: 3
