http://www.cse.ucsc.edu/~avg/ProofChecker/README Track organizer: Allen Van Gelder, avg where? at cs.ucsc.edu In the verified-unsat track of the 2011 Sat Solver Competition, solvers produce some verification output and are only tested on unsatisfiable benchmarks. NEW THIS YEAR: We accept submission of QBF solvers that produce certificates and we accept verification tools for both SAT and QBF. This directory, http://www.cse.ucsc.edu/~avg/ProofChecker/ , is mostly about the 2011 verified-unsat track. See the Documents/ subdirectory for miscellaneous papers and talks related to proofs. Because the verification track is very experimental, I will be more flexible than the main tracks about the forms of submission. In particular, submitters will have opportunities to submit scripts to make their tools compatible with standard formats, and the "standard" formats might vary, depending on the range of tools submitted. Because of the experimental nature of this track, there are no prizes and no official ranking system. See http://www.cse.ucsc.edu/~avg/ProofChecker/cert-poster-sat07.pdf for the poster shown at SAT 2007 for that year's track. We expect 2011 to be conducted similarly. There are several accepted proof formats, each with precise specifications. All formats may be found at http://www.cse.ucsc.edu/~avg/ProofChecker/ and are named as follows: fileformat_qir.txt fileformat_rup.txt ProofChecker-fileformat.txt These formats are designed to be neutral to the solver's methodology as far as possible but RUP and QIR are most natural for CDCL-style solvers that employ the grasp/chaff/minisat paradigm. The new QIR format works for Q-resolution proofs in clausal QBF and resolution proofs in SAT. The others are SAT only. The QIR format achieves the twin goals of accepting proofs that are compact and having a proof language that is recognizable in deterministic log space. See StratExtrJan11.tar for a tool to extract a winning strategy from a QIR refutation of an invalid QBF. We expect the most popular format to be RUP, so that is described briefly later in this page. If you want to submit a solver that uses one of the 2005 formats, in ProofChecker-fileformat.txt, that's OK too. Please note that the 2005 format makes much longer files than the RUP format. In C you might need to use cbigopen.{c,h} to create a file more than 2 GB. These files are in RUPkit.tar.gz and are used by checker3.c. --Allen Van Gelder QUICK PRIMER ON RUP PROOFS -------------------------- Download RUPkit.tar.gz (3785822 bytes) for tools to verify RUP proofs generated by your SAT solver. Or go to the directory RUPkit/ to look at what's in the tar file, and download them individually or with wget. NOTE: rupToRes_32_O3_3GB is apparently the same as rupToRes_32_O3 but it will use up to 3 GB of memory, whereas rupToRes_32_O3 uses at most 1 GB. Besides the programs, RUPkit/ contains several sample files, csh scripts, and 3 test files, with outputs. testeq.cnf is small enough to examine completely and check by hand. Notice that the *.rup in this directory do NOT have the 256-byte header specified in the fileformat_rup.txt document. rupToRes accepts this, but it's a good idea to provide that header to avoid mixups. Here is a command to make a newline terminated file exactly 256 bytes (be careful to use single and double quotes as shown -- assumes csh or tcsh syntax): cat hdr_any | awk '{f=f $0;}END{printf "%-255s\n",substr(f,1,255);}' > hdr256 This is all tested on Linux Red Hat, but rupToRes could have some bugs in cases that did not arise during my tests. I will try to help resolve any problems you encounter, if you are participating in the verification track of SAT 2007. See BUG.LIST in this directory for bugs and work-arounds. The purpose of the RUP proof format is to allow a solver to "retro-fit" some verification output without too much trouble, and to allow for some possibility of derivation rules other than resolution. The short story is that solvers in the vein of grasp, chaff, berkmin, minisat, and picosat can simply output their conflict clauses (the final one being the empty clause). Also, we believe that look-ahead solvers will be able to output a RUP proof without great difficulty. Notice that it is not necessary to actually use all the clauses you output, but if the redundancy is too great, you might hit filesize limits or time limits. Practical experience is needed to see what can be achieved. The proof should be on a separate file and should consist of one 256-byte header (255 bytes plus the newline), followed by derived clauses in the usual DIMACS format. Please visit the above URL for additional details and software to check your verification output during development. The rest of this page goes into details in case there are solvers that use a different approach, or readers just want to see some examples. The idea is based on E.Goldberg, Y.Novikov. Verification of proofs of unsatisfiability for CNF formulas. Design, Automation and Test in Europe. March 2003, pp. 886--891. http://eigold.tripod.com/papers.html. The method of deriving the derived clause is not important. It might be by resolution or some other means. It is only necessary that the soundness of the derived clauses can be verified by a procedure that we call REVERSE UNIT-PROPAGATION (RUP for short). In the discussion, clauses are delimited by square brackets. Suppose that F is the input formula and D_1, ..., D_r are the derived clauses that have been output, with D_r = [] (the empty clause). The sequence D_1, ..., D_r is an RUP refutation of F if and only if: For each i from 1 through r, steps 1--3 below derive []: 1. Negate D_i = [L_{ij}] producing one or more unit clauses, [-L_{ij}]. For example, the negation of [x, y, z] is [-x], [-y], [-z]. (When i = r, there are no unit clauses produced.) 2. Add these unit clauses [-L_{ij}] and D_1 through D_{i-1} to F. (When i = r, there are no unit clauses to add.) 3. Perform unit-clause propagation. EXAMPLES Say the input file F has these clauses: c example p cnf 4 4 1 -4 -3 0 1 4 0 -1 0 -4 3 0 EXAMPLE 1 The RUP proof might be these integers (newlines are for readability): 4 3 0 0 This proof contains two derived clauses, [4 3] and []. Negating [4 3] gives [-4] and [-3], which are added to F. Then unit-clause propagation derives []. Next, add [4 3] (the first derived clause) to F, and again unit-clause propagation derives []. So the proof is verified as sound, although we have no idea why the program output [4 3] as a derived clause. EXAMPLE 2 Another correct RUP proof for F is: 0 This is because the formula itself can be refuted by unit-clause propagation. EXAMPLE 3 Another correct RUP proof for F is: 1 -3 0 1 3 0 -3 0 3 0 0 This is the sequence of resolvents in a resolution proof, but no structure is given, so that fact might not be obvious.