Reputation: 528
The intention of the functions below is to find matching pairs of length 2 subsequences. E.g. [1,2,3,4,2,3,5] has a matching pair of [2,3]. If I feed a sequence w/out a matching pair the first function gets a memory overflow but the second finishes quickly.
Why the difference and is there a general technique for how to dig into this sort of issue?
repeat_pair(Items) :-
append(_,[X,Y|_],Left),
append(Left,[X,Y|_],Items).
repeat_pair(Items) :-
append(Left,[X,Y|_],Items),
append(_,[X,Y|_],Left).
Upvotes: 4
Views: 154
Reputation: 10102
Why the difference ... ?
It suffices to consider the following failure-slice of your first version:
repeat_pair(Items) :- append(_,[X,Y|_],Left), false,append(Left,[X,Y|_],Items).
This fragment shares some properties with the original program: In particular, if this fragment loops, then your original program will loop, too. Now, what might influence termination? Only the variable Items
might influence it, but all occurrences are behind the false
. So there is no way how Items
influences termination and the goal append(_,[X,Y|_],Left)
does not terminate. Thus, even if this program succeeds, it will not terminate. Think of repeat_pair("abab").
... and is there a general technique for how to dig into this sort of issue?
Yes, there is. Question answered.
(It's above technique called failure-slice. See the tag for more examples.)
Finally, here is a version using dcgs:
repeat_pair(Cs) :-
phrase(( ..., [A,B], ..., [A,B], ... ), Cs).
... --> [] | [_], ...
A generalization which also accepts non-lists like [a,b,a,b|non_list]
is below. Often, such generalizations are preferred, as they are slightly shorter and slightly more efficient and may possess slightly better termination properties (not in this case).
repeat_pair(Cs) :-
phrase(( ..., [A,B], ..., [A,B] ), Cs, _).
Upvotes: 2
Reputation: 60004
Prolog operational semantic is based on chronologic backtracking, and this means that goals order is important for correctness, as it provides instantiation of variables in a predictable way. That said, you could use append/2 instead of append/3 to achieve the same result:
repeat_pair(Items) :- append([_,[X,Y],_,[X,Y],_],Items).
effectively moving into a library predicate the logic.
Upvotes: 2
Reputation: 9378
As you observe in your comment, in the first variant there is no relation between Items
and the goal append(_, [X,Y|_], Left)
, so on backtracking this goal will enumerate all its infinitely many solutions.
Your second approach is the best one for this problem, but indeed in general you sometimes need to restrict the search space in other ways.
The problem with the first clause is that Left
is completely unconstrained. But we know that we never want it to be longer than Items
. So we can write a predicate to express "Items
is a list, and Left
is a shorter or equal length list":
list_shorterorequal(_List, []).
list_shorterorequal([_|List], [_|ShorterOrEqual]) :-
list_shorterorequal(List, ShorterOrEqual).
?- list_shorterorequal([a, b, c], ShorterOrEqual).
ShorterOrEqual = [] ;
ShorterOrEqual = [_G938] ;
ShorterOrEqual = [_G938, _G941] ;
ShorterOrEqual = [_G938, _G941, _G944].
And then you can adapt your first implementation like this:
repeat_pair_1x(Items) :-
list_shorterorequal(Items, Left),
append(_,[X,Y|_],Left),
append(Left,[X,Y|_],Items).
And this succeeds once, then terminates quickly:
?- repeat_pair_1x([1,2,3,4,2,3,5]).
true ;
false.
If your Prolog has a between/3
predicate (they usually do), you could also use that to restrict the length of Left
before we know anything else about Left
itself:
repeat_pair_2x(Items) :-
length(Items, ItemsLength),
between(2, ItemsLength, LeftLength),
length(Left, LeftLength),
append(_,[X,Y|_],Left),
append(Left,[X,Y|_],Items).
This also terminates nicely:
?- repeat_pair_2x([1,2,3,4,2,3,5]).
true ;
false.
Note that these two predicates behave slightly differently on the most general query, and in general if the argument of the call is not bound to a finite list:
?- repeat_pair_1x(Items).
Items = [_G918, _G924, _G918, _G924|_G940] ;
Items = [_G918, _G924, _G930, _G918, _G924|_G946] ;
Items = [_G918, _G924, _G930, _G924, _G930|_G949] ;
Items = [_G918, _G924, _G930, _G936, _G918, _G924|_G952] ;
Items = [_G918, _G924, _G930, _G936, _G924, _G930|_G955] .
?- repeat_pair_2x(Items).
Items = [_G918, _G921, _G918, _G921] ;
Items = [_G918, _G921, _G918, _G921, _G930] ;
Items = [_G918, _G921, _G924, _G918, _G921] ;
Items = [_G918, _G921, _G924, _G921, _G924] ;
Items = [_G918, _G921, _G918, _G921, _G930, _G933] .
By calling length/2
on Items
in the second variant, we force it to be a finite list. There might be cases where you don't want this.
Upvotes: 2