bock.steve

Reputation: 231

Converting grammar into prolog

So I am trying to convert a grammar that defines variable definitions in a programming language. This is my first ever Prolog program, and it's very different from typical languages, so I am confused. The grammar goes as follows:

S -> T S | T

T -> char F semicolon | int F semicolon

F -> id | id G

G -> comma F

So effectively it would return true for things like "char id semicolon" or "int id comma id semicolon char id semicolon".

I am trying to turn this into a Prolog program that recognizes this grammar. What I have so far is:

type([char|T],T).
type([int|T],T).
def([id|T], T).
com([comma|T], T).
semi([semicolon|T], T).

vardef(L,S) :-
  type(L,S1),
  def(S1,S2),
  com(S2,S3),
  def(S3,S4),
  semi(S4,S).

variable_definition(L) :-
  vardef(L,[]).

However, this obviously only recognizes input of the specific form "int/char id comma id semicolon". I don't know how to allow a variable number of "comma id" repetitions before the semicolon, or a whole new variable definition after the first one. Other questions on this site about the same thing typically deal with grammars of a fixed shape like this, not ones that accept a variable amount of input.

EDIT: So the question is two-fold. First, how do I make it recognize two different variable definitions, one right after the other? I assume I have to change the last line to do this, but I am unsure how.

Second, how do I make it recognize a variable number of "id"s separated by commas? I want it to recognize "char id semicolon" as well as "char id comma id semicolon".

Upvotes: 2

Views: 294

Answers (1)

lurker

Reputation: 58244

The most natural way to express a grammar like this in Prolog is using Prolog's DCG notation:

S -> T S | T
T -> char F semicolon | int F semicolon
F -> id | id G
G -> comma F

s --> t, s | t.
t --> [char], f, [semicolon] | [int], f, [semicolon].
f --> [id] | [id], g.
g --> [comma], f.

The nice thing about DCG notation is that it expresses the grammar more directly. You can then use phrase/2 to run it:

| ?- phrase(s, [char, id, semicolon]).

true ? ;

no
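
As a quick check, the two example inputs from the question can be run through the same grammar. This is a minimal sketch, assuming the input has already been split into a list of atoms; the helper name accepts/1 is made up for illustration and is not part of the grammar itself.

```prolog
% DCG from above: each nonterminal of the grammar becomes one rule.
s --> t, s | t.
t --> [char], f, [semicolon] | [int], f, [semicolon].
f --> [id] | [id], g.
g --> [comma], f.

% accepts/1 is a hypothetical helper: it succeeds if the whole token
% list is a valid sequence of variable definitions.
accepts(Tokens) :- phrase(s, Tokens).

% Example queries:
% ?- accepts([char, id, semicolon]).
% ?- accepts([int, id, comma, id, semicolon, char, id, semicolon]).
% ?- accepts([int, id, comma, semicolon]).   % fails: a comma must be followed by id
```

Both positive queries may leave a choice point behind, which is why the toplevel offers to search for more solutions before answering no.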

With this grammar you can also, to some extent, generate valid phrases:

| ?- phrase(t, S).

S = [char,id,semicolon] ? ;

S = [char,id,comma,id,semicolon] ? ;

S = [char,id,comma,id,comma,id,semicolon] ? ;
...

However...

| ?- phrase(s, S).

Fatal Error: local stack overflow (size: 16384 Kb, reached: 16384 Kb,
environment variable used: LOCALSZ)

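The nonterminal s is defined in such a way that generation doesn't terminate: the recursive alternative t, s is tried first, so with an unbound list s keeps calling itself and never reaches the base case t. We can fix this by moving the recursive case later: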

s --> t | t, s.

Then:

| ?- phrase(s, S).

S = [char,id,semicolon] ? ;

S = [char,id,comma,id,semicolon] ? ;

S = [char,id,comma,id,comma,id,semicolon] ? ;
...
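
Note that this enumeration stays inside the char alternative, because f on its own already has infinitely many solutions. One common way to see the other phrases as well is to fix the length of the list before calling phrase/2. The following is only a sketch of that idea; phrases_of_length/2 is a hypothetical helper name introduced here.

```prolog
s --> t | t, s.
t --> [char], f, [semicolon] | [int], f, [semicolon].
f --> [id] | [id], g.
g --> [comma], f.

% phrases_of_length/2 (hypothetical helper): collect every valid phrase
% that has exactly N tokens. Fixing the length first keeps the search
% finite, because every t consumes at least three tokens.
phrases_of_length(N, Phrases) :-
    findall(S, ( length(S, N), phrase(s, S) ), Phrases).

% ?- phrases_of_length(3, Ps).
% Ps = [[char,id,semicolon],[int,id,semicolon]]
%
% Enumerating by increasing length instead of all at once:
% ?- length(S, _), phrase(s, S).
```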

You can see how this is implemented in standard notation by listing the Prolog code for the predicate:

| ?- listing(t).

% file: user

t(A, B) :-
        (   A = [char|C],
            f(C, D),
            D = [semicolon|B]
        ;   A = [int|E],
            f(E, F),
            F = [semicolon|B]
        ).

yes
| ?-

You could write this more succinctly as:

t([char|T], B) :-
    f(T, [semicolon|B]). 
t([int|T], B) :-
    f(T, [semicolon|B]).

This would be called as t(L, []), which gives the same result as phrase(t, L).


If we list the rest of the predicates, you can get a complete solution in the form you are asking for:

| ?- listing.
s(A, B) :-
        (   t(A, B)
        ;   t(A, C),
            s(C, B)
        ).

t(A, B) :-
        (   A = [char|C],
            f(C, D),
            D = [semicolon|B]
        ;   A = [int|E],
            f(E, F),
            F = [semicolon|B]
        ).

f(A, B) :-
        (   A = [id|B]
        ;   A = [id|C],
            g(C, B)
        ).

g([comma|A], B) :-
        f(A, B).

Refactoring slightly (making it less verbose):

s(L, S) :-
    t(L, S).
s(L, S) :-
    t(L, S1),
    s(S1, S).

t([char|T], S) :-
    f(T, [semicolon|S]). 
t([int|T], S) :-
    f(T, [semicolon|S]).

f([id|S], S).
f([id|S1], S) :-
    g(S1, S).

g([comma|S1], S) :-
    f(S1, S).

From here you can define: variable_definition(D) :- s(D, []).
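
Put together, the plain-Prolog version might look like the sketch below. It assumes the same atom tokens used in the question (char, int, id, comma, semicolon); the sample queries are illustrations only.

```prolog
% Each predicate takes the token list and the remainder that is left
% after consuming one S, T, F or G from its front.
s(L, S) :- t(L, S).
s(L, S) :- t(L, S1), s(S1, S).

t([char|T], S) :- f(T, [semicolon|S]).
t([int|T], S)  :- f(T, [semicolon|S]).

f([id|S], S).
f([id|S1], S) :- g(S1, S).

g([comma|S1], S) :- f(S1, S).

% The whole input must be consumed, so the remainder is [].
variable_definition(D) :- s(D, []).

% ?- variable_definition([char, id, semicolon]).
% ?- variable_definition([int, id, comma, id, semicolon, char, id, semicolon]).
% ?- variable_definition([char, id, id, semicolon]).   % fails: ids need a comma between them
```

The second clause of s handles one definition following another, and the f/g pair handles any number of comma-separated ids, which covers both parts of the question.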

Upvotes: 5
