Reputation: 75
I am trying to prove that particular implementations of how to calculate the edit distance between two strings are correct and yield identical results. I went with the most natural way to define edit distance recursively as a single function (see below). This caused coq to complain that it couldn't determine the decreasing argument. After some searching, it seems that using the Program Fixpoint mechanism and providing a measure function is one way around this problem. However, this led to the next problem that the tactic simpl no longer works as expected. I found this question which has a similar problem, but I am getting stuck because I don't understand the role the Fix_sub function is playing in the code generated by coq for my edit distance function which looks more complicated than in the simple example in the previous question.
Questions:
Is there an easier way using Program Fixpoint to get reductions?
Fixpoint min_helper (best :nat) (l : list nat) : nat :=
match l with
| nil => best
| h::t => if h<?best then min_helper h t else min_helper best t
end.
Program Fixpoint edit_distance (s1 s2 : string) {measure (length s1+ length s2)} : nat :=
match s1, s2 with
| EmptyString , EmptyString => O
| String char rest , EmptyString => length s1
| EmptyString , String char rest => length s2
| String char1 rest1 , String char2 rest2 =>
let choices : list nat := S ( edit_distance rest1 s2) :: S (edit_distance s1 rest2) :: nil in
if (Ascii.eqb char1 char2)
then min_helper (edit_distance rest1 rest2 ) choices
else min_helper (S (edit_distance rest1 rest2)) choices
end.
Next Obligation.
intros. simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. reflexivity. Qed.
Next Obligation.
simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. apply PeanoNat.Nat.le_succ_diag_r. Qed.
Next Obligation.
simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. apply PeanoNat.Nat.le_succ_diag_r. Qed.
Theorem simpl_edit : forall (s1 s2: string), edit_distance s1 s2 = match s1, s2 with
| EmptyString , EmptyString => O
| String char rest , EmptyString => length s1
| EmptyString , String char rest => length s2
| String char1 rest1 , String char2 rest2 =>
let choices : list nat := S ( edit_distance rest1 s2) :: S (edit_distance s1 rest2) :: nil in
if (Ascii.eqb char1 char2)
then min_helper (edit_distance rest1 rest2 ) choices
else min_helper (S (edit_distance rest1 rest2)) choices
end.
Proof. intros. induction s1.
- induction s2.
-- reflexivity.
-- reflexivity.
- induction s2.
-- reflexivity.
-- remember (a =? a0)%char as test. destruct test.
--- (*Stuck??? Normally I would unfold edit_distance but the definition coq creates after unfold edit_distance ; unfold edit_distance_func is hard for me to reason about*)
Upvotes: 1
Views: 524
Reputation: 29193
You can instead use Function
, which comes with Coq and produces a reduction lemma for you (this will actually also generate a graph as Inductive R_edit_distance
in the vein of the alternative development you mention, but here it's quite gnarly—that might be because of my edits for concision)
Require Import String.
Require Import List.
Require Import PeanoNat.
Import ListNotations.
Require Import FunInd.
Require Recdef.
Fixpoint min_helper (best : nat) (l : list nat) : nat :=
match l with
| [] => best
| h :: t => if h <? best then min_helper h t else min_helper best t
end.
Function edit_distance
(ss : string * string) (* unfortunately, Function only supports one decreasing argument *)
{measure (fun '(s1, s2) => String.length s1 + String.length s2) ss} : nat :=
match ss with
| (String char1 rest1 as s1, String char2 rest2 as s2) =>
let choices := [S (edit_distance (rest1, s2)); S (edit_distance (s1, rest2))] in
if Ascii.eqb char1 char2
then min_helper (edit_distance (rest1, rest2)) choices
else min_helper (S (edit_distance (rest1, rest2))) choices
| (EmptyString, s) | (s, EmptyString) => String.length s
end.
all: intros; simpl; rewrite Nat.add_succ_r; repeat constructor.
Qed.
Check edit_distance_equation. (* : forall ss : string * string, edit_distance ss = ... *)
Print R_edit_distance. (* Inductive R_edit_distance : string * string -> nat -> Prop := ... *)
The reason the graph Inductive
definition (either then nice one you cited or the messy one generated here) doesn't require assurances of termination is that terms of Inductive
type have to be finite already. A term of R_edit_distance ss n
(which represents edit_distance ss = n
) should be seen as a record or log of the steps in the computation of edit_distance
. Though a generally recursive function could possibly get stuck in an infinite computation, the corresponding Inductive
type excludes that infinite log: if edit_distance ss
were to diverge, R_edit_distance ss n
would simply be uninhabited for all n
, so nothing blows up. In turn, you don't have the ability to actually compute, given ss
, what edit_distance ss
is or a term in {n | R_edit_distance ss n}
, until you complete some termination proof. (E.g. proving forall ss, {n | R_edit_distance ss n}
is a form of termination proof for edit_distance
.)
Your idea to use structural recursion over some auxiliary type is exactly right (that's the only form of recursion that is available anyway; both Program
and Function
just build on it), but it doesn't really have anything to do with the graph inductive...
Fixpoint edit_distance
(s1 s2 : string) (n : nat) (prf : n = String.length s1 + String.length s2)
{struct n}
: nat := _.
Something along the lines of the above should work, but it'll be messy. (You could recurse over an instance of the graph inductive instead of the nat
here, but, again, that just kicks the bucket to building instances of the graph inductive.)
Upvotes: 2
Reputation: 33389
There is a common trick to this kind of recursion over two arguments, which is to write two nested functions, each recursing over one of the two arguments.
This can also be understood from the perspective of dynamic programming, where the edit distance is computed by traversing a matrix. More generally, the edit distance function edit xs ys
can be viewed as a matrix of nat
with rows indexed by xs
and columns indexed by ys
. The outer recursion iterates over rows xs
, and for each of those rows, when xs = x :: xs'
, the inner recursion iterates over its columns ys
to generates the entries of that row from another row with a smaller index xs'
.
From Coq Require Import String Ascii.
Local Open Scope string_scope.
Infix "::" := String.
(* structural recursion on xs *)
Fixpoint edit (xs ys : string) : nat :=
match xs with
| "" => String.length ys
| x :: xs' =>
(* structural recursion on ys *)
let fix edit_xs ys :=
match ys with
| "" => String.length xs
| y :: ys' =>
let orth := min (edit xs' ys) (edit_xs ys') in
if (x =? y)%char then
min (edit xs' ys') orth
else
orth
end in
edit_xs ys
end.
Upvotes: 1