Reputation: 3188
I understand that Prolog programmers typically use DCGs instead of regular expressions for matching patterns in a string. In Perl, one might write
if ( '... accd' =~ /a+b*c{2,4}d$/ ) {
say "matched";
}
How would one match the same pattern in Prolog?
Upvotes: 3
Views: 732
Reputation: 60034
I've extended this answer
:- op(100, xf, *).
:- op(100, xf, +).
rexp(C) --> [C].
rexp([T|Ts]) --> rexp(T), rexp(Ts).
rexp([]) --> [].
rexp(eps) --> [].
rexp(_*) --> [].
rexp(R*) --> rexp(R), rexp(R*).
rexp(R+) --> rexp(R), rexp(R*).
rexp((R1|R2)) --> ( rexp(R1) ; rexp(R2) ).
rexp(range(R,N,M)) -->
{between(N,M,L),
length(D,L),
maplist(copy_term(R),D)
}, rexp(D).
then your regular expression match could be
?- phrase(rexp([a+, b*, range(c,2,4), d]), [a,c,c,d]),
writeln(matched).
Note that in this way we match atoms instead of single characters.
edit after false' comment, I think that the first clause should read
rexp(C) --> {atomic(C)}, [C].
to avoid for instance
?- phrase(rexp([a+]), [a+]).
true ;
Indeed, after the correction we have the expected outcome:
?- phrase(rexp([a+]), [a+]).
false.
done edit
Instead of interpreting regular expressions the pattern could be 'hardcoded' (much easier)
% I prefer the equivalent clause below
% p1 --> "a", p1 ; "a", p2.
p1 --> "a", (p1 ; p2).
p2 --> "b", p2 ; p3.
p3 --> ("cc" ; "ccc" ; "cccc"), "d".
then
?- phrase(p1, "accd").
true
here we match single characters (a string in Prolog it's a list of character codes)
Upvotes: 1