YiLuo
YiLuo

Reputation: 117

matching patterns in matrix using prolog dcg

Is it possible with DCGs to extract some 2D list, such that it can be represented by a context free grammar

S -> [ A , B ]

A -> [0,0,0,0,0]
A -> NULL

B -> [1,1,1,1,1]
B -> NULL

for example:

[[0,0,0,0,0], [1,1,1,1,1]] is valid
[[1,1,1,1,1]] is valid, where A is NULL.
[[0,0,0,0,0]] is valid, where B is NULL.

I tried something like this

zeros --> [].
zeros --> [0,0,0,0,0].
ones --> [].
ones --> [1,1,1,1,1]
matrix --> [A, B],
    {phrase(zeros, A)},
    {phrase(ones, B)}.

But this is not going to work the way I wanted it to, because in this case, the "compiler" thought I want an empty list '[]' instead of NULL.

so [[], [1,1,1,1,1]] will work while [[1,1,1,1,1]] is not.

How do I approach this?

Upvotes: 1

Views: 169

Answers (2)

CapelliC
CapelliC

Reputation: 60034

DCG notation reserves lists in production to represent sequences of 'tokens'. Then, your production zeros - for instance - will match a sequence of five zeroes, not a list of five zeroes. There is some confusion here, just because your target language (a sequence of lists) uses the metalanguage notation (Prolog lists indicate sequences of terminals in DCG productions). I think you could write it simply

zeros --> [ [0,0,0,0,0] ].
ones --> [ [1,1,1,1,1] ].

matrix --> (zeros ; ones), matrix ; [].

test :- phrase(matrix, [ [1,1,1,1,1],[0,0,0,0,0] ]).

Upvotes: 1

Isabelle Newbie
Isabelle Newbie

Reputation: 9378

The problem is that once you write matrix --> [A, B], that rule will definitely generate a two-element list, no matter what A and B are.

So you want to alternatively generate one-element lists [A] or [B]. You could do this explicitly:

a --> [0, 0, 0, 0, 0].

b --> [1, 1, 1, 1, 1].

matrix -->
    [A],
    { phrase(a, A) }.
matrix -->
    [B],
    { phrase(b, B) }.
matrix -->
    [A, B],
    { phrase(a, A) },
    { phrase(b, B) }.

This works:

?- phrase(matrix, Matrix).
Matrix = [[0, 0, 0, 0, 0]] ;
Matrix = [[1, 1, 1, 1, 1]] ;
Matrix = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]].

But this is a lot of typing, and it's not very flexible if you want to extend it.

So let's try to generalize the fixed [A, B] bit. As a first step, we can use a list//1 DCG that just describes its own argument list:

list([]) -->
    [].
list([X|Xs]) -->
    [X],
    list(Xs).

We can use this as follows:

?- phrase(list([a, b, c]), Xs).
Xs = [a, b, c].

And use it to define a matrix:

matrix_with_list -->
    list([A, B]),
    { phrase(a, A) },
    { phrase(b, B) }.

This looks like we haven't made progress yet:

?- phrase(matrix_with_list, Matrix).
Matrix = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]].

But now we can change list//1 a bit to only describe a sublist of its argument list:

optional_list([]) -->
    [].
optional_list([_X|Xs]) -->
    % skip this X!
    optional_list(Xs).
optional_list([X|Xs]) -->
    % keep this X
    [X],
    optional_list(Xs).

This behaves as follows:

?- phrase(optional_list([a, b, c]), Xs).
Xs = [] ;
Xs = [c] ;
Xs = [b] ;
Xs = [b, c] ;
Xs = [a] ;
Xs = [a, c] ;
Xs = [a, b] ;
Xs = [a, b, c].

Now we can adapt the previous definition:

matrix_with_optional_list -->
    optional_list([A, B]),
    { phrase(a, A) },
    { phrase(b, B) }.

And we get:

?- phrase(matrix_with_optional_list, Matrix).
Matrix = [] ;
Matrix = [[1, 1, 1, 1, 1]] ;
Matrix = [[0, 0, 0, 0, 0]] ;
Matrix = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]].

Pretty good! But it's not great to have all those phrase/2 calls even if they refer to elements that do not end up in the matrix.

So let's generalize some more, to a DCG whose argument is a list of DCGs, and that describes a sublist of the list of lists described by those DCGs:

optional_phrase([]) -->
    [].
optional_phrase([_Rule|Rules]) -->
    % skip this rule
    optional_phrase(Rules).
optional_phrase([Rule|Rules]) -->
    % apply this rule
    [List],
    { phrase(Rule, List) },
    optional_phrase(Rules).

The main insight here was that you can use phrase/2 in a "higher-order" manner, where its first argument is not a literal atom (or functor term) naming a DCG, but a variable bound to such an atom or term. However, you must ensure that these variables are really bound when you apply this rule.

With this the final definition of the matrix is just:

matrix_with_optional_phrase -->
    optional_phrase([a, b]).

This now enumerates matrices as before, but it only ever executes phrase/2 for elements that are actually part of the matrix:

?- phrase(matrix_with_optional_phrase, Matrix).
Matrix = [] ;
Matrix = [[1, 1, 1, 1, 1]] ;
Matrix = [[0, 0, 0, 0, 0]] ;
Matrix = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]].

Upvotes: 2

Related Questions