Alexander Rossa
Alexander Rossa

Reputation: 2080

Predicate for removing certain terms from compound term in Prolog

I would like to write a Prolog predicate that takes in a compound term as its argument and outputs this compound term with some of the nested terms removed. For example, let's say that I have a compound term:

outer_term(level_one(level_two_a(X), level_two_b(Y)), level_one(level_two_b(Z))).

And I would like to write a predicate extract_terms/2 which would take this term and returned it without occurences of level_two_a/1.

extract_terms(Term, ExtractedTerm) :-
       *** Prolog Magic ***.

Is there a built-in (or semi-built-in) way to do this in Prolog? If not, how would I go about doing this? One way that occurs to me would be to use =../2 operator to convert the Term into a list and then somehow use some built-in predicate like subtract/3 to get rid of the predicates that I want. The trouble I have is making this work with list that has nested terms as its items.

I would appreciate any ideas, thank you.

Upvotes: 3

Views: 369

Answers (1)

mat
mat

Reputation: 40768

First, a general guideline:

Everything that can be expressed by pattern matching should be expressed by pattern matching.

You have asked a similar question previously, although it was a bit simpler. Still, let us consider the simpler case first: You said a possible instance of a verb phrase would be:

VP = vp(vp(verb(making), adj(quick), np2(noun(improvements))))

and you want to extract the verb. Well, the simplest approach is to use pattern matching, or more generally, unification, like this:

?- VP = vp(vp(verb(making), adj(quick), np2(noun(improvements)))),
   VP = vp(vp(Verb, _, _)).

This yields:

Verb = verb(making).

Thus, we have successfully "extracted" verb(making) from such a phrase by virtue of unification.

Now to the slightly more complex task you are considering in this question: At this point, you may wonder whether you have chosen a good representation of your data. Frequent use or even the very necessity of (=..)/2 typically indicates a problem with your representation, since it may mean that you have lost track or control of the possible shapes of your data.

In this concrete case, you state as an example:

outer_term(level_one(level_two_a(X), level_two_b(Y)), level_one(level_two_b(Z))).

and you want to remove occurrences of level_two_a. You can now of course begin to mess with (=..)/2, which requires a conversion of such terms to lists, then some reasoning on these lists, and a second conversion from lists back to such structures. That's not how we want to work with our data. In addition to other drawbacks, it would preclude more general usage patterns that we expect from relations.

Instead, let us fix the data representation so that we can cleanly distinguish the different cases. For example, instead of "hardcoding" the very parameter we need to distinguish the cases inside of a functor, let us make the distinction explicit: We want to be able to distinguish, by pattern matching, level 1 from level 2.

So, the following representation suggests itself:

outer_term([level(1, [level(2, X),
                      level(2, Y)]),
            level(1, [level(2, Z)])]).

This may need some additional attributes, such as a and b, and I leave extending this representation to represent such attributes as an easy exercise. The general idea should be clear though: We have thus achieved a uniform representation about which we can easily reason symbolically.

It is now easy to describe the relation between a (potentially nested) list of such levels and the levels without the "level 2" elements:

without_level_2(Ls0, Ls) :-
        phrase(no_level_2(Ls0), Ls).

no_level_2([]) --> [].
no_level_2([L|Ls]) -->
        no_level_2_(L),
        no_level_2(Ls).

no_level_2_(level(2,_))   --> [].
no_level_2_(level(L,Ls0)) --> [level(L,Ls)],
        { dif(L, 2),
          without_level_2(Ls0, Ls) }.

See for more information about this formalism.

Sample query:

?- outer_term(Ts0),
   without_level_2(Ts0, Ts).

Yielding:

Ts = [level(1, []), level(1, [])] .

Note that to truly benefit from this representation, you need to obtain it in the first place by using or generating terms of such shapes. Once you have this ensured, you can conveniently stick to pattern matching to distinguish the cases. Among the main benefits of this approach we find convenience, performance and generality. For example, we can use the DCG shown above not only to extract but also to generate terms of this form:

?- length(Ls0, _), without_level_2(Ls0, Ls).
Ls0 = Ls, Ls = [] ;
Ls0 = [level(2, _56)],
Ls = [] ;
Ls0 = Ls, Ls = [level(_130, [])],
dif(_130, 2) ;
Ls0 = [level(_150, [level(2, _164)])],
Ls = [level(_150, [])],
dif(_150, 2) .

This is truly a relation, usable in all directions. For this reason, I have avoided imperative names like "extract", "remove" etc. in the predicate name, because these always suggest a particular direction of use, not doing justice to the generality of the predicate.

Upvotes: 4

Related Questions