Reputation: 589
I am testing the inductive capabilities of some theorem provers (such as Z3, Alt-Ergo, Vampire etc.) using the TPTP syntax. To my surprise, none of them managed to demonstrate the following simple conjecture:
tff(t1, type, (fun: $int > $int )).
tff(ax1, axiom, (
! [A: $int] : (
$less(A, 1) => (fun(A) = 123)
)
)).
tff(ax2, axiom, (
! [A: $int] : (
$greatereq(A, 1) => (fun(A) = fun($difference(A, 1)))
)
)).
tff(conj1, conjecture, ! [A: $int] : ($greatereq(A, 1) => (fun(A) = 123))).
% END OF SYSTEM OUTPUT
% RESULT: SOT_EWCr1V - Z3---4.8.9.0 says Timeout - CPU = 60.09 WC = 35.47
% OUTPUT: SOT_EWCr1V - Z3---4.8.9.0 says None - CPU = 0 WC = 0
This conjecture can be easily proven by induction, however it does not seem to be the case for the vast majority of theorem provers I have tested this on. Obviously, if I restrict the domain to only one element instead of the whole set of integers, the ATP succeeds because it only needs to check against a limited set of numbers:
tff(t1, type, (fun: $int > $int )).
tff(ax1, axiom, (
! [A: $int] : (
$less(A, 1) => (fun(A) = 123)
)
)).
tff(ax2, axiom, (
! [A: $int] : (
$greatereq(A, 1) => (fun(A) = fun($difference(A, 1)))
)
)).
tff(conj1, conjecture, ! [A: $int] : ((A = 5) => (fun(A) = 123))).
% END OF SYSTEM OUTPUT
% RESULT: SOT_j9liHr - Z3---4.8.9.0 says Theorem - CPU = 0.00 WC = 0.08
% OUTPUT: SOT_j9liHr - Z3---4.8.9.0 says Proof - CPU = 0.00 WC = 0.09
Is this a general limitation of automatic theorem provers? Is there any tool that performs well with induction?
PS: You can try it out using the following online tool: https://tptp.org/cgi-bin/SystemOnTPTP
PS2: The TPTP syntax manual can be found here: https://www.tptp.org/TPTP/TR/TPTPTR.shtml
Upvotes: 3
Views: 792
Reputation: 93
As pointed out in a comment above, Vampire does support induction. However, as is common with other theorem provers, to get Vampire to do what you want is sometimes a little tricky. In this case, to get it to use induction, you have to run with options
--mode portfolio --schedule induction
With those options set, Vampire finds a proof of the above in next to no time (0.04 seconds on my machine)
The TPTP website does not let you set specific options when running provers, so if you want to try out the above you will have to grab a Vampire release from here or build from source.
Upvotes: 8
Reputation: 30418
This is expected. SMT solvers don't do induction out-of-the-box. You can "coax" them to do induction by proving the base case, then posing the induction-hypothesis and have them prove it using quantifiers; but that's usually a fool's errand. SMT solvers are simply not the right choice for doing inductive proofs. Here're some relevant prior discussions on stackoverflow regarding this matter:
And many others.
Having said that, the new define-fun-rec
construct in SMTLib allows for recursive definitions, whose proofs are naturally done by induction. So, I expect the community will be going towards that direction; adding inductive capabilities over time. For instance, see:
for a paper on how to do this properly in an SMT solver. As far as I know CVC5 has incorporated some of these ideas, but expecting out-of-the-box inductive proofs would be naive at this point. (See https://github.com/cvc5/cvc5/issues/1796)
So, long story short: No, SMT solvers don't do induction. You can coax them to do it, and there's recent work that add further capabilities, but a push-button experience is unlikely. If your goal is to reason about recursive definitions and hence you rely on induction, your best bet is to use a semi-automated theorem prover such as Isabelle, Coq, HOL, HOL-Light, ACLL2, Lean, etc.. all of which have strong facilities to do induction. Furthermore they incorporate SMT solvers as "tactics" so you get the best of both worlds in that sense. (i.e., use a manual strategy to dissect your proof to simple enough subgoals, handle induction etc, and ship the rest off to an SMT-solver.)
Upvotes: 1