http://www.cse.ucsc.edu/~avg/ProofChecker/fileformat_rup.txt
VERSION 1.3 Jan 31, 2009.
AUTHOR Allen Van Gelder, avg where? at cs.ucsc.edu
COPYRIGHT Allen Van Gelder, Jan, 2009. All rights reserved. Permission is
granted to reproduce this document for noncommercial research purposes.
1. SPECIFICATION FOR REVERSE UNIT-PROPAGATION PROOF FILES
---------------------------------------------------
The file format for a non-resolution proof is the following, where
bytes are numbered from 0, and double quotes indicate character sequences
that are NOT null-terminated:
0-3 "%RUP"
4-7 format specs, terminated by white space.
The following format will be accepted for the SAT09 competition:
"D32 " denotes ASCII signed integers separated by white
space; e.g. use printf format "%ld\n".
8-31 two unsigned ASCII integers, n and m, representing the
number of variables and number of clauses in input formula,
each followed by white space (must match p-line of input
formula, n might only be an upper bound on the number of
variables); e.g., use printf format "%11d %11d\n".
32-255 identification, comments, etc., ASCII. It is recommended
that this section be newline-terminated, for readability.
256-end RUP (reverse unit propagation) proof, ASCII. This section consists
of a sequence of unsigned and signed integers in ASCII,
representing literals. The sequence is described in detail in
the next section.
1.1. DIMACS-LIKE REVERSE UNIT-PROPAGATION PROOF FORMAT
The proof is a sequence of integers that represents the concatenation
of subsequences called DERIVED CLAUSES, defined below.
Let n and m denote the number of variables and number of clauses,
as given on the p-line of the input formula.
A LITERAL is a positive integer in the range 1-n or the negation of
such an integer.
A DERIVED CLAUSE in the Dimacs-like format D32 is a subsequence consisting
of k literals, followed by the unsigned integer 0.
The value of k may be 0, denoting the empty clause.
The literals may include duplicates, but NOT contradictory pairs.
In all ASCII formats, integers are separated arbitrarily by white space
consisting of one or more of these characters: space, tab, or newline.
The usual convention is that a newline follows the clause-terminating 0,
but this is not a firm requirement. However, there must be a newline
at the end of the proof.
A common convention is that each clause occupies exactly one line, but
this is not even a loose requirement. The 0 marks the end of a clause.
Using " %ld" (note the space before the %) with fscanf permits reading
one integer without worrying about newlines; check the return code
to be sure an item was read.
1.2 LOGICAL CONTENT OF DERIVED CLAUSES
The short story is that solvers in the vein of grasp, chaff, berkmin, minisat
can simply output their conflict clauses (the final one being the empty
clause). The rest of this section goes into details in case there are
solvers that use a different approach.
The idea is based on
E.Goldberg, Y.Novikov.
Verification of proofs of unsatisfiability for CNF formulas.
Design, Automation and Test in Europe. March 2003, pp. 886--891.
http://eigold.tripod.com/papers.html.
The method of deriving the derived clause is not important.
It might be by resolution or some other means.
It is only necessary that the soundness of the derived clauses can be
verified by a procedure that we call REVERSE UNIT-PROPAGATION (RUP for
short).
In the discussion, clauses are delimited by square brackets.
Suppose that F is the input formula and D_1, ..., D_r are the
derived clauses that have been output, with D_r = [] (the empty clause).
The sequence D_1, ..., D_r is an RUP refutation of F if and only if:
For each i from 1 through r, this procedure derives []:
1. Negate D_i = [L_{ij}] producing one or more unit clauses, [-L_{ij}].
For example, the negation of [x, y, z] is [-x], [-y], [-z].
(When i = r, there are no unit clauses produced.)
2. Add these unit clauses [-L_{ij}] and D_1 through D_{i-1} to F.
(When i = r, there are no unit clauses to add.)
3. Perform unit-clause propagation.
1.2.1 EXAMPLES
EXAMPLE 1
Say the input file has these clauses:
c example 1
p cnf 4 4
1 -4 -3 0
1 4 0
-1 0
-4 3 0
The proof might be these integers (newlines are for readability):
4 3 0
0
This proof contains two derived clauses, [4 3] and [].
Negating [4 3] gives [-4] and [-3], which are added to F.
Then unit-clause propagation derives [].
Next, add [4 3] (the first derived clause) to F, and again
unit-clause propagation derives [].
So the proof is verified as sound, although we have no idea why the
program output [4 3] as a derived clause.
EXAMPLE 2
Another correct proof for Example 1 is:
0
This is because the formula itself can be refuted by unit-clause propagation.
EXAMPLE 3
Another correct proof for Example 1 is:
1 -3 0
1 3 0
-3 0
3 0
0
This is the sequence of resolvents in a resolution proof, but no
structure is given, so that fact might not be obvious.
1.3. NAVIGATION HINTS
The proof format is designed to be traversed forward or backward.
A clause of k literals occupies (k+1) integers of storage if the
proof is stored in a single array in the same format as the file.
The proof checker needs to set up its own data structures for
unit-clause propagation.
This is beyond the capability of a log-space program.
2. REMARKS
-------
The purpose of the RUP proof format is to allow a solver to "retro-fit"
some verification output without too much trouble, and to allow for
some possibility of derivation rules other than resolution.
One drawback is that checking the output is quadratic in its length,
by any methods known so far. Unit-clause propagation needs to be repeated
for each succeeding derived clause.
Another drawback is that the proof checker can report that a clause
is not verified, but cannot necessarily explain exactly why.
In this case, some clauses might be left over after unit-clause propagation
terminates.
If the formula is unsatisfiable, ANY clause is logically implied by it.
However, as a debugging tool, this system may well provide helpful
information.
The organizers of the SAT05 solver competition did not have
a program to check the RUP proof format, so it was not offered.
The organizers of the SAT09 solver competition do have
a tool to check the RUP proof format, consisting of some scripts
and a few programs.
The tool runs on Linux, and needs C and C++ compilers.
Together with upgrades since 2007, the current suite (2009) has verified
a RUP proof of 262 MB.
The programs for the resolution proof checker for the binary and ascii
formats accepted in 2005 are also available, and run on Linux.
The 2005 formats will be accepted in 2007.
All programs will be designed for a unix environment, such as
Linux, Solaris, etc., and will be written to work on both big-endian
and little-endian machines (without knowing which type the machine is).
In some cases, unix features might be used to improve the processing
of very large files, such as pipes, mmap, and forks.
The design should be such that specialized features are confined
to subroutines that can be replaced with less specialized
implementations, at the cost of performance and size limitations.
Any RUP proof checker will be in the complexity class known as "P-complete"
(by reduction from positive unit resolution, by reduction from
path systems in Cook's original paper).
This is a higher class than the "log space" class of the simplest
resolution proof checker.
Essentially, a "log space" algorithm has room to store only a
fixed number of integers; it cannot use an array of n integers, e.g.
Therefore, checking the correctness of a "log space" algorithm
is usually fairly easy.
On the other hand, a "P-complete" algorithm can use much more complex
data structures, and checking its correctness can be quite difficult.
In other words, when an RUP proof checker disagrees with a solver,
it is far from clear which one is correct.
Our rupToRes tool addresses this problem by generating a full resolution proof
in the "log space" format, i.e., %RES L32, as used in the 2005 competition,
which in turn is checked by our full resolution proof checker, checker3.
A generated %RES proof of 167 GB has been verified by checker3.