http://www.cse.ucsc.edu/~avg/ProofChecker/README
Track organizer: Allen Van Gelder, avg where? at cs.ucsc.edu
In the verified-unsat track of the 2011 Sat Solver Competition, solvers
produce some verification output and are only tested on unsatisfiable
benchmarks.
NEW THIS YEAR: We accept submission of QBF solvers that produce certificates
and we accept verification tools for both SAT and QBF.
This directory, http://www.cse.ucsc.edu/~avg/ProofChecker/ , is mostly
about the 2011 verified-unsat track. See the Documents/ subdirectory
for miscellaneous papers and talks related to proofs.
Because the verification track is very experimental, I will be more
flexible than the main tracks about the forms of submission.
In particular, submitters will have opportunities to submit scripts
to make their tools compatible with standard formats, and the "standard"
formats might vary, depending on the range of tools submitted.
Because of the experimental nature of this track, there are no prizes
and no official ranking system.
See
http://www.cse.ucsc.edu/~avg/ProofChecker/cert-poster-sat07.pdf
for the poster shown at SAT 2007 for that year's track.
We expect 2011 to be conducted similarly.
There are several accepted proof formats, each with precise specifications.
All formats may be found at http://www.cse.ucsc.edu/~avg/ProofChecker/
and are named as follows:
fileformat_qir.txt
fileformat_rup.txt
ProofChecker-fileformat.txt
These formats are designed to be neutral to the solver's methodology
as far as possible but RUP and QIR are most natural for CDCL-style solvers
that employ the grasp/chaff/minisat paradigm.
The new QIR format works for Q-resolution proofs in clausal QBF and resolution
proofs in SAT. The others are SAT only. The QIR format achieves the
twin goals of accepting proofs that are compact and having a proof
language that is recognizable in deterministic log space.
See StratExtrJan11.tar for a tool to extract a winning strategy from
a QIR refutation of an invalid QBF.
We expect the most popular format to be RUP, so that is described briefly
later in this page.
If you want to submit a solver that uses one of the 2005 formats,
in ProofChecker-fileformat.txt, that's OK too.
Please note that the 2005 format makes much longer files than the
RUP format. In C you might need to use cbigopen.{c,h} to create a file
more than 2 GB. These files are in RUPkit.tar.gz and are used by checker3.c.
--Allen Van Gelder
QUICK PRIMER ON RUP PROOFS
--------------------------
Download RUPkit.tar.gz (3785822 bytes) for tools to verify RUP proofs
generated by your SAT solver.
Or go to the directory RUPkit/ to look at what's in the tar file,
and download them individually or with wget.
NOTE: rupToRes_32_O3_3GB is apparently the same as rupToRes_32_O3 but
it will use up to 3 GB of memory, whereas rupToRes_32_O3 uses at most 1 GB.
Besides the programs, RUPkit/ contains several sample files, csh scripts,
and 3 test files, with outputs.
testeq.cnf is small enough to examine completely and check by hand.
Notice that the *.rup in this directory do NOT have the 256-byte header
specified in the fileformat_rup.txt document. rupToRes accepts this,
but it's a good idea to provide that header to avoid mixups.
Here is a command to make a newline terminated file exactly 256 bytes
(be careful to use single and double quotes as shown -- assumes csh or tcsh syntax):
cat hdr_any | awk '{f=f $0;}END{printf "%-255s\n",substr(f,1,255);}' > hdr256
This is all tested on Linux Red Hat, but rupToRes could have some bugs
in cases that did not arise during my tests.
I will try to help resolve any problems you encounter, if you are
participating in the verification track of SAT 2007.
See BUG.LIST in this directory for bugs and work-arounds.
The purpose of the RUP proof format is to allow a solver to "retro-fit"
some verification output without too much trouble, and to allow for
some possibility of derivation rules other than resolution.
The short story is that solvers in the vein of grasp, chaff, berkmin, minisat,
and picosat can simply output their conflict clauses (the final one being
the empty clause).
Also, we believe that look-ahead solvers will be able to output
a RUP proof without great difficulty. Notice that it is not necessary
to actually use all the clauses you output, but if the redundancy is too
great, you might hit filesize limits or time limits.
Practical experience is needed to see what can be achieved.
The proof should be on a separate file and should consist of one
256-byte header (255 bytes plus the newline), followed by derived clauses
in the usual DIMACS format.
Please visit the above URL for additional details and software
to check your verification output during development.
The rest of this page goes into details in case there are solvers
that use a different approach, or readers just want to see some examples.
The idea is based on
E.Goldberg, Y.Novikov.
Verification of proofs of unsatisfiability for CNF formulas.
Design, Automation and Test in Europe. March 2003, pp. 886--891.
http://eigold.tripod.com/papers.html.
The method of deriving the derived clause is not important.
It might be by resolution or some other means.
It is only necessary that the soundness of the derived clauses can be
verified by a procedure that we call REVERSE UNIT-PROPAGATION (RUP for
short).
In the discussion, clauses are delimited by square brackets.
Suppose that F is the input formula and D_1, ..., D_r are the
derived clauses that have been output, with D_r = [] (the empty clause).
The sequence D_1, ..., D_r is an RUP refutation of F if and only if:
For each i from 1 through r, steps 1--3 below derive []:
1. Negate D_i = [L_{ij}] producing one or more unit clauses, [-L_{ij}].
For example, the negation of [x, y, z] is [-x], [-y], [-z].
(When i = r, there are no unit clauses produced.)
2. Add these unit clauses [-L_{ij}] and D_1 through D_{i-1} to F.
(When i = r, there are no unit clauses to add.)
3. Perform unit-clause propagation.
EXAMPLES
Say the input file F has these clauses:
c example
p cnf 4 4
1 -4 -3 0
1 4 0
-1 0
-4 3 0
EXAMPLE 1
The RUP proof might be these integers (newlines are for readability):
4 3 0
0
This proof contains two derived clauses, [4 3] and [].
Negating [4 3] gives [-4] and [-3], which are added to F.
Then unit-clause propagation derives [].
Next, add [4 3] (the first derived clause) to F, and again
unit-clause propagation derives [].
So the proof is verified as sound, although we have no idea why the
program output [4 3] as a derived clause.
EXAMPLE 2
Another correct RUP proof for F is:
0
This is because the formula itself can be refuted by unit-clause propagation.
EXAMPLE 3
Another correct RUP proof for F is:
1 -3 0
1 3 0
-3 0
3 0
0
This is the sequence of resolvents in a resolution proof, but no
structure is given, so that fact might not be obvious.