, 7 min read
Benchmarking theorem provers for programming tasks: yices vs. z3
9 thoughts on “Benchmarking theorem provers for programming tasks: yices vs. z3”
, 7 min read
9 thoughts on “Benchmarking theorem provers for programming tasks: yices vs. z3”
For bitvector theory (aka machine integers), another solver that gives great results is boolector. It might be better than yices. Z3 is especially good as the swiss-army knife of SMT solvers (aka, I want to be able to do everything)/
I am going with Geoff’s advice who found boolector to be slower than yices.
Did you try benchmarking boolector?
I haven’t come around to benchmarking boolector on your example. If I have the time by the end of the week, I’ll report in this thread.
However, in the annual competition of the SMT community (see this year’s results at https://smt-comp.github.io/2020/results.html, where I learned about Bitwuzla), boolector has regularly been at the top in the “QF-ABV (this is the category of formulas you are interested in) in the last few years.
Yices is usually not very far behind and, in practice, our research group has used both since they have slightly different heuristics.
One good way to have SMT solvers go at your problem is to make them available to SMT-LIB, the library of problems used to evaluate the solvers, in particular during SMT-COMP.
So, I’ve taken some (procrastination) time to run the benchmark with boolector and cvc4 as well.
The key take away is that yices is indeed clearly better for this problem.
The surprise to me is that z3 beats boolector in single query mode and that boolector is super slow in incremental mode (which is the mode you seem to be using through the API AFAIK).
The repo with the details is here
https://github.com/rbonichon/smt-inverse-benchmark
I had never played with solvers before. I always wrote my own program for these types of problems. Guessing that even the “fast” solver is still much slower than I wrote a crude equivalent in C. Running on a google cloud shell console server, it finds the inverses of odd integers up to 9999 in 0.09 seconds.
https://gist.github.com/nerdralph/aba30d4d1145bcc9129f1b97604dc480
In most cases, you can design a specific solver by hand that is much faster. However, you are trading your own engineering time against CPU cycles.
The difference is about 2x, and not 15x if you set the proper mode for z3 (although I agree that this should be documented better and/or it should be smarter about figuring out the required mode).
Use the following diff:
s = SolverFor(“QF_BV”)The difference is about 2x, and not 15x if you set the proper mode for z3 (although I agree that this should be documented better and/or it should be smarter about figuring out the required mode). Use the following:
s = SolverFor(“QF_BV”)
instead of
s = Solver()
I think it is not fair to call SMT solvers “theorem provers”. Yes, they are very useful, they can automate certain (sub)tasks, especially in real theorem provers, but they have some fundamental limitations. No SMT can deal on it’s own with problems like verification of floating point algorithms or floating point units. Much like bounded model checkers, great for bug hunting in control path, but totally inadequate for datapaths.