THF OVERVIEW FOR hotptp-bnf2-v3.2.0.6.txt by Allen Van Gelder (avg at host shown below) THF formulas are intended to represent typed lambda calculus formulas. This overview is somewhat informal and incomplete, with the goal of helping users compose and read THF formulas. Other useful files can be found in the same directory as this document, or at the URL http://cse.ucsc.edu/~avg/TPTPparser/. The syntax is completely specified by the BNF2 document: hotptp-bnf2.txt. Several example files (names beginning thf-) should help to clarify this text. A linux-executable parser for THF is also available as hotptp-yl-parser or hotptp-yl-parser-verbose. (This parser also parses TFF, FOF and CNF). Parsers for other platforms can be generated from available files. The semantics of THF formulas is largely beyond the scope of the TPTP project. PUNCTUATION STYLE The principles of style in TPTP for FOF (first order formulas) and CNF are followed, in general. 1. An expression with a binary operator as the principal operator must be enclosed in parentheses before being used in a larger expression. For purposes of this rule, the colon (:) and DEFINITION symbol (:=) are considered to be punctuation, not binary operators. 2. Multiple occurrences of the SAME associative binary operator can be enclosed in one pair of parentheses. 3. There are no precedence relationships among binary operators: scopes must be made explicit with parentheses. 4. An expression with a prefix operator as the principal operator does NOT need to be enclosed in parentheses before being used in a larger expression. 5. All operators are logical symbols, in the sense that their interpretation is fixed in the language. 6. To ensure that formulas are Prolog-readable, each colon (:) and each operator consisting of "symbol" characters should be followed by white-space. TOKENS (WORDS) The words or tokens in a thf formula consist of logical symbols, constants, variables, and punctuation symbols. 1. Variables begin with an uppercase letter; the underscore (_) is NOT considered an uppercase letter, differing from Prolog. 2. Constants introduced by users begin with a lowercase letter. Note that a constant can represent a function in this terminology. Also, a proposition is a function that returns a boolean value, so a constant can represent a proposition. Certain "defined constants" begin with a dollar-sign ($), followed by a lowercase letter. They currently include: the boolean truth-values, $true and $false, which denote the lambda-calculus expressions ^ [X, Y]: X and ^ [X, Y]: Y; the boolean type ($oType, $o), i.e., {$true, $false}; the type of individuals ($iType, $i), which is the normal range of first-order variables; the set of all types ($tType, $t). Pairs in parentheses are synonyms. The language admits "system constants", which begin with two dollar-signs ($$), followed by a lowercase letter, but their role in thf is unclear. Finally, strings enclosed in single quotes, strings enclosed in double quotes, and numbers denote constants. 3. The PRIMITIVE logical symbols represent abstraction (lambda, ^) and function application (@). "lambda" and "^" are synonyms. Abstraction is right-associative and application is left-associative, in accordance with the standard conventions of lambda calculus. 4. The TYPING logical symbols represent mapping (->, >), cross product (*), and type-union (+). "->" and ">" are synonyms. Mapping is right-associative, cross product and type-union are left-associative, following standard conventions. 5. The DEFINITION symbol is := and is punctuation, not a binary operator. Currently, the left operand of a definition (:=) must be an untyped constant, and the right operand can be any expression. 6. The TUPLE forming symbol is comma (,). Comma-separated lists enclosed in parentheses are TUPLES, which represent the PAIR function of lambda calculus. That is, (E1, E2) denotes ^ [X: $o] (X @ E1 @ E2), so that ((E1, E2) @ $true) = E1 and ((E1, E2) @ $false) = E2. However, comma-separated lists enclosed in square brackets ([]) are specific language constructs for abstraction and quantification. The comma is right-associative in both contexts, for thf formulas. 6. The remaining logical symbols can be regarded as notational conventions for certain fixed lambda expressions. They include !, ?, ~, &, |, =, among which "and" (&) and "or" (|) are left-associative, equality (=) is nonassociative, and "forall" (!), "exists" (?), and "not" (~) are prefixi and right-associative. There are some additional nonassociative binary operators that can be defined in terms of these (=>, <=, <=>, ~&, ~|). These symbols can be treated syntactically as constants by enclosing them in parentheses, e.g., (!), (?), (~), (&), (|), (=), etc. EXPRESSIONS The simplest expressions are untyped variables and constants, e.g., X1 and fib. The next simplest are typed variables and constants, e.g., X1: ((int * list) -> $o), fib: (nat -> nat). A typed constant can be an entire thf formula, e.g., thf(1, type, fib: (nat -> nat)). The syntax also permits a constant to be typed wherever it appears in an expression. The "prefix operators" corresponding to abstraction, quantification, and scoped definition consist of multiple tokens. In these examples, the parts that function as prefix operators are underlined: lambda [X1: ((int * list) -> $o)]: (g @ X1) ---------------------------------- ? [X1: ((int * list) -> $o), L]: ((X1 @ (3, L)) = $true) -------------------------------- letrec [t1 := (nat -> $o), c1 := ^ [X]: (X = (fib @ X))]: (c1:t1 @ 1) --------------------------------------------------------- The point is that the colon (:) that follows the right bracket (]) may be PARSED as a binary operator, but it is not considered to be a binary operator FUNCTIONALLY, so the syntax does not require that parentheses enclose the expressions shown. The same applies to both := and : occurring inside the square brackets. This is further illustrated in the next example, in which the three prefix operators are underlined: ^ [X1: ((int * list) -> $o)]: ? [G]: ~ (G @ X1) ----------------------------- ------ - THROUGH VERSION v3.2.0.4, Type expressions can use only typing operators and typing operators can only appear in type expressions. The atoms, or leaves, of a type expression are untyped variables and constants. BEGINNING VERSION v3.2.0.5, In principle, a type expression can be just about any lambda expression with "enough" parentheses, but it is expected that normally a type expression will consist of the typing operators, and lambda abstraction (^ and @) for polymorphic types. Normally, type expressions follow a colon (;) and apply to the untyped variable or constant that precedes the colon, as seen in the examples above. When a typed constant is the entire formula, it is taken as a global declaration; when it is within a larger expression, it's meaning is unspecified at present. Type expressions can also appear where a lambda-calculus expression is required, i.e, as either operand of "@". Definitions may appear only in the bracketed list following "letrec", or at top level. A definition takes the form "untyped_constant := expression". When definitions appear within a letrec, their scope is the formula that completes the letrec expression, as in: letrec [one := 1, two := 2]: (plus @ (one, two)). Global definitions appear at top level, comprising the entire formula. E.g., thf(1, definition, one := 1). thf(2, definition, (two := 2)). Note that a definition may optionally be enclosed in parentheses.