AndreaNobili
AndreaNobili

Reputation: 43047

Recursive similarity between trees according to the number of common subtrees in Prolog

I am stufying Prolog using SWI Prolog and I am finding many difficult with this code snippet that found if 2 binary trees have N common subtree (having the same root):

/* BASE CASE: T1 and T2 are two tree composed of only the root X so it is
             TRUE that they have exactly one common subtree
*/
delta(tree(X,[]),tree(X,[]),1):- !.

/* RULE: T1 and T2 are DIFFERENT and are structured tree.
         It is TRUE that T1 and T2 have N subtrees if it is TRUE that:
*/
delta(tree(X,RX),tree(X,RX1),N):- sons(RX,SX), 
                                  sons(RX1,SX)
                              subdelta(RX,RX1,N), 
                                  !.
/* Else whatever T1 and T2 it is true that they have 0 common tree
   (here we are in the case that the previous delta\2 predicate are 
   boot failed) 
*/
delta(_,_,0):- !.

subdelta([],[],1).

subdelta([T1|R1],[T2|R2], N):-
    delta(T1,T2,N1),
    subdelta(R1,R2, NR),
    N is (1 + N1)*NR.

I think that the delta/3 predicate it is true if the first tree have N common subtrees with the second tree

He rappresent the tree in this way:

tree(F, LIST_OF_SUBTREES).

So for example this is a tree composed by a root X and two leaves u and v:

tree(x, [tree(u,[]), tree(v,[])])

I think that the delta/3 predicate it is declined into the 3 possible cases:

1) T1 and T2 are two tree composed of only the root X so it is TRUE that they have exactly one common subtree

**2) T1 and T2 are DIFFERENT and are structured tree that have more levels so it is TRUE that T1 and T2 have N subtrees if it is TRUE that: ?!?!

3) Else, if both the prevuous delta\2 predicate are failed, whatever T1 and T2 it is true that they have 0 common tree

I think that this interpretation is correct (I hope so...) but I have some difficulties to understand the second rule: what could be sons/2 predicate (it seems to me that this is note a SWI Prolog built in predicate and I have no its implementation on the slide where I am studyin)

What is for you? And what is it's logic?

Tnx

Andrea

Upvotes: 1

Views: 264

Answers (1)

Daniel Lyons
Daniel Lyons

Reputation: 22803

Your interpretation of the three rules seems reasonable to me. For comparison, I would rephrase them as:

  1. delta(T1, T2, 1) holds if T1 and T2 have the same value and are empty (leaf nodes).
  2. delta(T1, T2, N) holds if T1 and T2 have N common subtrees.
  3. delta(T1, T2, 0) holds if T1 and T2 have 0 common subtrees.

I'm unclear on why these cuts are necessary. I suppose they're green cuts, because a pair of trees can't have 1, N and 0 common subtrees simultaneously.

sons/2 is interesting. I could imagine it working a few different ways. One thing we know for sure is that if two trees have common subtrees, sons/2 should generate the same value; it must work that way because otherwise sons(RX, SX), sons(RX1, SX) wouldn't ever work. (Note that there's a missing comma on that line).

One question that remains is, does sons/2 work by generating all the subtrees, or just the nearest pair? It seems likely to me that it generates the nearest pair only, because subdelta/3 calls delta/3, leading to indirect recursion. If sons/2 generated all subtrees, this would result in unbounded recursion or at least a lot of unnecessary work. So I would bet that sons/2 looks like this:

sons(tree(_,Children), X) :- member(X, Children).

This suggests at least one case where delta/3 is going to do something more intelligent than one would expect at first blush: the case where T1 and T2 are reflections of each other. sons/2 of T1 and T2 will unify the left with the right and then the right with the left, so you'll get the maximal similarity from sharing subtrees but not exact structure.

What's most surprising to me about this is that delta/3 doesn't seem to be counting differences, it seems to be counting similarities. This isn't what one would expect from the name, but it follows from the implementation. And it's hard to imagine how one would count differences directly--what is the upper limit? With files, for instance, you can count the differences by saying, each line could be the same (0) or different (1) and then adding up the differences.

Hope this is on the right track and helps!

Upvotes: 1

Related Questions