/*!
\page theory_api_howto HOWTO Write a Decision Procedure
Note: This document is under construction. Some newer aspects of adding a
decision procedure are not yet documented here. Please let us know if you have
a question that is not answered here.
\section theory_api_contents Contents
- \ref theory_api_prelim
- \ref theory_api_files
- \ref theory_api_naming
- \ref theory_api_general
- \ref theory_api_addnew
- \ref theory_api_start
- \ref theory_api_config
- \ref theory_api_header
- \ref theory_api_expr
- \ref theory_api_kind_decl
- \ref theory_api_kind_reg
- \ref theory_api_expr_subclass
- \ref theory_api_methods
- \ref theory_api_invars
- \ref theory_api_flow
- \ref theory_api_backtrack
- \ref theory_api_ileaves
- \ref theory_api_inputs
- \ref theory_api_assertFact
- \ref theory_api_checkSat
- \ref theory_api_setup
- \ref theory_api_update
- \ref theory_api_addSharedTerm
- \ref theory_api_rewrite
- \ref theory_api_solve
- \ref theory_api_computeType
- \ref theory_api_computeTCC
- \ref theory_api_notifyInconsistent
- \ref theory_api_print
- \ref theory_api_parseExprOp
- \ref theory_api_outputs
- \ref theory_api_enqueueFact
- \ref theory_api_setInconsistent
- \ref theory_api_simplifyThm
- \ref theory_api_enqueueEquality
- \ref theory_api_inconsistentThm
- \ref theory_api_isTerm
- \ref theory_api_isLiteral
- \ref theory_api_isAtomic
- \ref theory_api_updateHelper
- \ref theory_api_getType
- \ref theory_api_getTCC
- \ref theory_api_parseExpr
- \ref theory_api_proofs
- \ref theory_api_proof_classes
- \ref theory_api_proof_rule
\section theory_api_prelim Preliminaries and Coding Guidelines
\subsection theory_api_files Directory Structure and Files
Each theory (a.k.a. decision procedure, or DP) must reside in a
directory src/theory_foo (where foo is the theory
name). All the sounce files and Makefile.in must be in that
directory, except for some shared header files, which must reside in
the common directory src/include. In particular,
theory_foo.h must be placed in src/include.
The master source file for a theory should be called
theory_foo.cpp. Typically, a new theory needs its own set of
proof rules, which are implemented as three files:
foo_proof_rules.h, foo_theorem_producer.h, and
foo_theorem_producer.cpp. The first one is the abstract
interface to the rules, which the untrusted DP code can
\#include. The other two files implement this abstract API,
and comprise the trusted part of the code. All these files should be
placed in src/theory_foo.
\subsection theory_api_naming Naming Convensions
Each individual theory is a subclass of a Theory class, and therefore,
its name should be class TheoryFoo.
A theory may define several methods for creating expressions in
theory_foo.h. Typically, they are named as
barExpr(), bazExpr(), etc. For instance,
theory_arith.h defines plusExpr(), minusExpr(), etc. These
functions are most useful if they are declared as non-member functions
(not part of the class TheoryFoo), since then they can be
called directly by the CVC3 library users.
It is also convenient to define testers like isBar() and
isBaz(), which return true for the corresponding expressions.
For example, theory_arith.h defines isPlus(),
isMinus(), etc.
Of course, the rest of the code should follow our general naming and
coding guidelines described at length in the project FAQ:
http://verify.stanford.edu/savannah/faq/?group_id=1&question=How_Do_I_Become_A_CVC3_Developer.txt .
\subsection theory_api_general General Guidelines
The Theory API is designed to minimize the changes in the central
(core) files when adding a new theory. Ideally, only
src/vcl/vcl.cpp file needs to be modified to add a
constructor call to the new theory. Everything else --- kinds,
special expression subclasses, types, etc. --- can (and should) be
defined locally in the theory-specific files.
There are several important guidelines for writing
proof rules specific to your theory:
- Each proof rule must be mathematically sound.
- It is a mortal sin to introduce a proof rule without a proper
doxygen comment concisely and clearly describing what it does.
- The code must contain enough CHECK_SOUND checks to
guarantee that all the premises are of the right form, and all the
side conditions of the rule are satisfied. That is, if these
soundness checks pass, then the rule application is guaranteed to be
sound mathematically.
- Keep the code in each rule as clean and simple as possible, so
that the correctness of the implementation w.r.t. the mathematical
formulation can be easily checked by manual inspection. That is, many
simple and independent rules are strongly preferred to a few big
mega-rules.
- Avoid calling other proof rules from within a proof rule. Rule of
thumb: if your rule has to use other rules, most likely, it is a
strategy (a derived rule), and should be implemented in the
untrusted part of the code (e.g. in theory_foo.cpp).
More details are given in the separate
\ref theory_api_proofs "section on proof rules".
Finally, and most importantly: document your
code!!
Every method, variable, class, or type should have a doxygen-style
comment with at least a brief description:
\verbatim
//! Convert Blah expression to Baz value
Baz Blah2Baz(const Blah& b);
\endverbatim
Imagine telling such a description to your colleague who has very
faint idea of what you are coding. If a brief description does not
make sense to him/her, or the function does something non-trivial or
non-obvious, add a longer description:
\verbatim
//! Convert Blah expression to Baz value
/*!
* Recursively descend into Blah, collect every Foo-leaf, order them in
* the descending order of Baz-ability, and wrap into the Baz expression.
*
* \param b is a non-trivial Blah value (otherwise we assert-fail).
*/
Baz Blah2Baz(const Blah& b);
\endverbatim
Often it is convenient to keep the brief description in the
*.h file, and the long description in the corresponding
*.cpp file.
The only exceptions are the \ref theory_api "Theory API methods", for
which no documentation is necessary, since these are already
documented in theory.h.
\section theory_api_addnew Adding New Theory
\subsection theory_api_start Before You Start...
...Make sure you have a very good idea about what your new decision
procedure is supposed to do, why it is useful, and what exactly it
decides.
Next, write down all the definitions mathematically and phrase the
scope of the new theory in terms of a logic with a particular
signature (the set of predefined interpreted and uninterpreted symbols
that belong to this theory). Make sure this signature is disjoint
from all the other theories (this is very important,
since CVC3 is based on Nelson-Oppen combination result, and
requires signatures to be disjoint).
Your decision procedure should decide inconsistency of a set of
formulas in the above logic. Make sure you actually know an algorithm
for that. While it is possible to add undecidable or incomplete
theories to CVC3 (and we already have some), those should only be
used for a very good reason. Otherwise, make sure you know that your
algorithm is sound, complete, and terminating.
Take the above algorithm and make it an online one. That is,
your algorithm should be able to accept a new formula at any point in
time, and perform some incremental computation to take that formula
into account. This is usually the most complicated step as far as the
math is concerned. I will explain this step in detail in section \ref
theory_api_invars "Theory API" below.
If you are adding your theory to the official CVC3 code base,
discuss your ideas on the mailing
list! Yes, do it, I mean it. If you do not,
we will notice your sneaky additions right away, and will ask many
difficult questions in public. So, you may as well just announce it
yourself.
\subsection theory_api_config Configure and Makefiles
Now you are ready to begin. Pick a name for your new theory, say,
Foo, and do the following steps.
- Create a new CVS branch. You already know how to do this, right?
If not, read the FAQ:
http://verify.stanford.edu/savannah/faq/?group=vergrp .
Read it NOW.
- Create a directory src/theory_foo, add it to CVS
- Create the following files (either empty or copy them from an
existing theory), and add them to your new CVS branch:
- In src/include/
- theory_foo.h
- In src/theory_foo/
- theory_foo.cpp
Makefile.in
If your new theory needs to create new custom proof rules, also
create
- in src/theory_foo/
- foo_proof_rules.h
foo_theorem_producer.h
foo_theorem_producer.cpp
While you at it, add the required header comment to
each source file (*.h and *.cpp), with the appropriately modified
file name, author, date, and an optional description after
the license excerpt. Look at any existing source file for a
template.
- Edit src/theory_foo/Makefile.in appropriately (look for a
template from some other theory, for example,
theory_array). You need to modify the values of:
- LIBRARY
- LOCAL_OBJ_DIR and LOCAL_SRC_DIR
- SRC
- HEADERS (if you created any *.h files in
src/theory_foo/, such as foo_proof_rules.h).
- Edit src/Makefile.in (it is a common Makefile for all theories):
- Add theory_foo.h to HEADER_NAMES
- Add cd theory_foo \&\& \$(MAKE) \$(TARGET) to
build: target
- Add theory_foo entry to CVC_LIB_NAMES
- Edit configure.in:
- Add an entry \$CVC_SRC_SUBDIR/theory_foo/Makefile to
CVC_OUTPUT_FILES variable
- Run autoheader, autoconf, ./configure.
- Edit src/vcl/vcl.cpp to add a call to your theory's constructor:
- Add \#include "theory_foo.h" at the top;
- Add d_theories.push_back(new TheoryFoo(this)); to
the end of VCL::VCL() constructor.
File src/vcl/vcl.cpp is the only source file
in the core of CVC3 that you really need to touch in order to add
a new theory. If you have to do something else, you are either doing it
wrong, or there better be a very good reason for it.
Now you are all set for compiling your new code, except that there is
no code to compile yet...
\subsection theory_api_header Header File: theory_foo.h
The easiest way to start is to use an existing
theory_whatever.h file as a template. This header file
should declare the following elements:
- class TheoryFoo as a subclass of Theory;
- Non-member (global) functions for creating expressions in your
theory, and possibly, testers for those expressions; for instance:
plusExpr(), multExpr(), ..., isPlus(),
isMult(), ...
- Optionally, declaration of new kinds (an enum type) used
in the above expressions.
Declaration of TheoryFoo class is required. Kinds and
functions for creating new expressions are only needed if you want the
end-user of the CVC3 library to be able to create expressions from your
theory and/or refer to their kinds directly. This is not always
desirable. For instance, you may want to generate special terms or
predicates that only make sense as temporary storage of information
for your theory. The DARK_SHADOW and
GRAY_SHADOW operators are good examples from the arithmetic
theory. You have no clue what I'm talking about? That's right, you
get it.
\subsection theory_api_expr Kinds, Expressions, and Types
The purpose of a new theory (or a decision procedure) is to
extend the existing logic of CVC3 with new interpreted, and
sometimes uninterpreted, operators and symbols. These new symbols
comprise a signature of the theory, and it must be disjoint
from the signatures of the other theories.
In code, the new symbols and operators translate into new kinds of
expressions. That is, new kinds and new values of the
Expr class.
A kind is a natural number which uniquely determines what
sort of expression it is (variable, uninterpreted function symbol,
specific operator like plus, minus, a type expression, etc.). Kinds
are also used to define your theory's signature, and hence,
they must be unique to your theory.
In the simplest case, a kind represents the entire expression or type.
Examples are simple types (like REAL and INT) and
some constants (e.g. TRUE and FALSE).
More typically, however, a kind represents an operator (PLUS,
MINUS, ...), and children of an expression of that kind are
the arguments of that operator.
Finally, some kinds are used by more complex expressions with
additional non-term attributes. For example, REC_LITERAL is
used by record expressions { f1=e1, f2=e2, ... } whose children are
the values of the fields (e1, e2, ...), and an additional attribute is
the vector of field names: (f1, f2, ...). Other examples are
quantifiers and lambda-terms. Such expressions require not only a new
kind declaration, but also a special subclass of ExprValue
(see below).
\subsection theory_api_kind_decl Kind Declaration
Kinds are declared as elements of an enum type (with doxygen
comments):
\verbatim
//! Local kinds of TheoryFoo
typedef enum {
FOO_TYPE = 3500, //!< Type FOO for the elements of this theory
BAR, //!< The binary (x | y) operator, where x,y: FOO
BAZ, //!< Unary buzz(x) predicate
BLEAH //!< An internal auxiliary term
} FooKind;
\endverbatim
Notice, that the numbering in this example starts from 3500. This can
be any integer which ensures that the new kinds do not clash with the
existing kinds in other theories. Pick a random one, see if your
kinds don't clash with others. You can check this either by
inspecting src/include/kinds.h (the cental declaration of
core kinds) and all the other theories, or compile and run
cvc3 to see if you get an error message about kinds
registered twice.
The above type declaration of FooKind can be either in
src/include/theory_foo.h (usually the best place), or in some
header file in src/theory_foo/ directory, depending on
whether you want to expose your kinds to the end library user or not.
In the latter case, do not forget to add that header file to the
HEADERS variable in your Makefile.in (see the \ref
theory_api_config "previous section").
\subsection theory_api_kind_reg Kind Registration
New kinds must be registered with the expression manager, which is
done in your theory's constructor (see the
\ref theory_api_methods "next section").
Registration is done by calling newKind() method of
ExprManager for each kind:
\verbatim
getEM()->newKind(FOO_TYPE, "FOO");
getEM()->newKind(BAR, "||");
....
\endverbatim
The string in the second argument is for printing the kind by the
pretty-printer, and also for parsing, that is, turning a string back
into a number (the kind).
\subsection theory_api_expr_subclass New Expression Subclasses
Sometimes an expression has a more complicated structure than a fixed
operator with arguments, and more sophisticated data structures are
necessary to represent it. CVC3 has an API for declaring and
registering a new expression subclass just for that purpose.
An expression subclass is a subclass of \ref ExprValue "ExprValue".
You can declare it where appropriate in your theory
files, using the same judgement as for the kinds. The new subclass
needs to be registered with \c ExprManager by calling
registerSubclass() method.
A subclass of ExprValue must implement the
following methods:
\verbatim
size_t computeHash() const;
size_t getMMIndex() const;
int isGeneric() const { return getMMIndex(); }
bool isGeneric(int idx) const { return (idx == getMMIndex()); }
// Syntactic equality of two expressions
bool operator==(const ExprValue& ev2) const;
// Make a clean copy of itself using the given memory manager
ExprValue* copy(MemoryManager* mm, ExprIndex idx = 0);
\endverbatim
Also, each subclass must overload new() and delete()
as follows:
\verbatim
void* operator new(size_t, MemoryManager* mm) {
return mm->newData();
}
void operator delete(void*) { }
\endverbatim
Subclasses may overload other virtual methods of ExprValue as
needed. For instance, arity(), getKids(),
ExprValueGenericAttr "getXXXAttr"(), etc. Some standard
attribute access functions can also be overloaded,
e.g. getString(), getRational().
However, you should not overload testers such as
isRecord(), isVar() and such (unless you really know
what you are doing), since the core relies on specific properties of
the corresponding subclasses.
For example, let us say that the auxiliary term BLEAH in
TheoryFoo needs a string attribute, which is a part of
expression, but is not a child (and not even a term in the logic), and
an integer attribute which is not part of the expression (e.g. it is
needed for marking expressions during the algorithm run). An example
of a new subclass for this expression is the following:
\verbatim
class BleahExpr: public ExprValue {
private:
string d_str; //!< Data field; defines the value of expression
int d_int; //!< An attribute
size_t d_MMIndex; //!< The registration index for ExprManager
public:
//! Constructor
BleahExpr(ExprManager* em, const string& str,
size_t mmIndex, ExprIndex idx = 0)
: ExprValue(em, ARRAY_VAR, idx), d_str(str),
d_int(0), d_MMIndex(mmIndex) { }
//! String attribute (part of expression)
const std::string& getString() const { return d_str; }
//! Integer attribute (not part of expression)
int& getIntAttr(int idx) { return d_int; }
size_t getIntAttrSize() const { return 1; }
ExprValue* copy(MemoryManager* mm, ExprIndex idx = 0) const {
return new(mm) BleahExpr(d_em, d_str, d_MMIndex, idx);
}
size_t computeHash() const {
return PRIME*ExprValue::hash(BLEAH)+s_charHash(d_str.c_str());
}
size_t getMMIndex() const { return d_MMIndex; }
size_t isGeneric() const { return getMMIndex(); }
bool isGeneric(size_t idx) const { return (idx == getMMIndex()); }
// Only compare the string, not the integer attribute
bool operator==(const ExprValue& ev2) const {
if(ev2.getMMIndex() != d_MMIndex) return false;
return (d_str == ev2.getString());
}
void* operator new(size_t, MemoryManager* mm) {
return mm->newData();
}
void operator delete(void*) { }
};
\endverbatim
Registration of this subclass generates a memory manager index, which
must be stored somewhere, usually in a class variable of
TheoryFoo, say d_bleahIdx of type size_t. Registration in this
case should be done in the constructor of TheoryFoo:
\verbatim
d_bleahIndex = getEM()->registerSubclass(sizeof(BleahExpr));
\endverbatim
The approved and recommended way of creating
expressions of BLEAH kind is the following:
\verbatim
Expr bleahExpr(const string& s) {
BleahExpr av(d_em, s, d_BleahIndex);
return newExpr(&av);
}
\endverbatim
Using the new expression is now very easy:
\verbatim
Expr bleah = bleahExpr("cow moo"), b2 = bleahExpr("asdfqwerty");
bleah.getIntAttr(0) = 42;
if(bleah != b2)
cout << bleah.getString() << b2.getIntAttr(0) << endl;
\endverbatim
Note that it is impossible to define a non-member expression creation
function for BleahExpr, since the memory manager index is
stored in the class-local variable of TheoryFoo. And
don't even think of using a static variable to work
around this limitation.
In general, you should NEVER store anything in a
static variable, since it may violate thread-safety of the library.
Before you ever think of using a static variable for anything, think
what happens if someone creates two copies of the system (with two
different sets of expression and memory managers). Is it safe to
share this variable among several system instances? In the case of
the memory manager index, the answer is definitely not.
One possible fix for this problem is to bind the memory manager to the
kind(s) that the subclass uses. Let us know if you really need this
feature, and it will be implemented.
\subsection theory_api_methods Constructor and API Methods
The only constructor for your new theory should have the following
declaration:
\verbatim
TheoryFoo(VCL* vcl);
\endverbatim
The constructor has to:
- Initialize the base class,
- Register new kinds and expression subclasses (if any) with
ExprManager,
- Collect the kinds which belong to the new theory in a vector, and
register the theory (method registerTheory()) with these
kinds,
- Initialize theory-specific data structures, if needed.
Here is a typical example of the constructor implementation:
\verbatim
TheoryFoo::TheoryFoo(VCL *vcl): Theory(vcl, "Foo") {
d_rules = createProofRules(vcl); // instantiate our own rules
// Register new local kinds with ExprManager
getEM()->newKind(FOO_TYPE, "FOO");
getEM()->newKind(BAR, "||");
getEM()->newKind(BAZ, "BAZ");
getEM()->newKind(BLEAH, "BLEAH");
// Register our expression subclass
d_bleahIndex = getEM()->registerSubclass(sizeof(BleahExpr));
vector kinds;
kinds.push_back(FOO_TYPE);
kinds.push_back(BAR);
kinds.push_back(BAZ);
kinds.push_back(BLEAH);
registerTheory(this, kinds);
}
\endverbatim
The following \ref theory_api "Theory API methods" are
required in the subclass:
\verbatim
void assertFact(const Theorem& e);
void checkSat(bool fullEffort);
void computeType(const Expr& e);
\endverbatim
Other methods are optional, but often needed:
\verbatim
Theorem rewrite(const Expr& e);
void setup(const Expr& e);
void update(const Theorem& e, const Expr& d);
ExprStream& print(ExprStream& os, const Expr& e);
Expr parseExprOp(const Expr& e);
\endverbatim
Finally, a few other methods are rarely needed in practice:
\verbatim
void addSharedTerm(const Expr& e);
Theorem solve(const Theorem& e);
void notifyInconsistent(const Theorem& thm);
Expr computeTCC(const Expr& e);
\endverbatim
The next section describes these API methods in more detail. You are
also strongly encouraged to \ref theory_api "read the documentation"
on each of these methods.
\section theory_api_invars Theory API: The Very Important Invariants
\subsection theory_api_flow High-Level Information Flow
Every decision procedure communicates with the CVC3 Core
interactively through several methods, most of which belong to the
Theory class. Generally, those methods that your theory class
re-implements carry information from the core, and others add
new information generated in the DP to the core. Some
methods (like rewrite() and solve()) return
information through their return values.
The chart below shows the flow of information to and from the decision
procedure, and which parts of the core are responsible for collecting
and generating it. Thick lines represent the most important methods,
and dashed lines represent methods used only for very special
occasions.
\image html theory_api_flow.jpg
\image latex theory_api_flow.eps "Flow of facts in CVC3" width=6in
The most straightforward path of information, once it gets to the core
through the external user input, is the following. First, all the
relevant terms are typechecked (computeType() method). This
method implements a step in the recursive typechecking algorithm,
where the current expression is typechecked based on its structure and
the types of its children. Typechecking of children is done by
calling getType() or getBaseType(). The latter
computes the largest supertype of the expression; for instance, the
exact type of x may be a subrange 0..5, which is a
subtype of INT, but its base type is REAL. Exact
and base types are cached on expressions, and are computed on demand.
An important property of computeType() is that it must not
only compute the exact type of the expression, but also verify that
all subexpressions are type-correct, relative to their base
types. For instance, x = y has type BOOLEAN,
and it is only type-correct if getBaseType(x) ==
getBaseType(y). If this property is violated,
TypecheckException must be thrown with the appropriate
message. Note, that the exact types of x and y may
be different, and even disjoint.
After the type checking, Type Correctness Conditions (TCCs)
are generated and checked. TCC is a formula which is true if and only
if any partial function in the original formula is used safely
according to the Kleene semantics. That is, every partial function is
either applied only to the arguments in its domain, or its value does
not influence the value of the formula.
TCCs are computed recursively by computeTCC() and
getTCC() methods, very similar to computing the types. If
your theory does not introduce partial functions explicitly (like
division in arithmetic), then you do not need to re-implement
computeTCC() in your theory; the default implementation will
do the job.
TCCs have a nice property that if they are true in the current
context, then the corresponding user formulas can be safely
interpreted by the total 2-valued models. Hence,
computeTCC() is the only Theory API method
that deals with partiality. All other methods consider any formula to
be total (no partial functions) and 2-valued (only TRUE or
FALSE, no undefined values.
Once TCC has been proven valid in the current context, the new fact
(formula) goes to the SAT solver, and if it is a literal (atomic
formula or its negation), it is submitted to the decision procedure
through assertFact(). The decision procedure processes this
fact, updates its internal data structures, and possibly reports a
contradiction (setInconsistent()) or new facts
(enqueueFact(), and in special cases,
enqueueEquality()). This completes the main loop of
information flow.
Note, that both enqueueFact() and setInconsistent()
deliver information to the same place in the core, except that
reporting the conflict bypasses the queue, and is taken care of
immediately, rather than after all the previously enqueued facts are
processed. This is the primary reason for having these two functions
separated (as opposed to having only enqueueFact()).
Inside this main loop, rewrite() and solve() are
called to transform (or simplify) the facts before they reach the rest
of the core. Normally, these functions do not have side-effects
(except for caching results), and return new (simplified) facts
through their return values.
When the SAT solver runs out of facts, and the context is still
satisfiable, it calls checkSat() with
fullEffort==true. At this point, the decision procedure must
determine whether all the information it has seen so far makes the
context satisfiable or not w.r.t. its theory. Just like in the case
of assertFact(), it may either report a contradiction, or
enqueue a new fact. If any new fact is enqueued, it starts the main
loop again. If checkSat() does not generate any new facts
and does not find a contradiction, the core stops and reports the
context to be satisfiable.
Method checkSat() is also called every time the fact queue
becomes empty, before the SAT solver asserts a new splitter. In this
case, the fullEffort argument is set to false, and
the decision procedure is not required to do anything. Many DPs,
however, choose to perform some relatively inexpensive checks to
detect inconsistencies and/or new facts, which increases performance.
Similarly, if a new fact is enqueued, the main loop continues (without
the SAT solver asserting new splitters) until the queue is empty.
In CVC3, every term (non-formula expression) has a canonical
representative in the union-find database. This database
represents the equivalence classes of terms w.r.t. logical equality.
All the terms in the formulas passing through the core are
simplified by replacing them with their canonical
representatives.
Often, a decision procedure wants to be notified when a subexpression
changes its canonical representative. For instance, if the DP has
seen an term 2*x+3*y, and x has changed its
representative to y+2, then it is important to conclude that
2*x+3*y == 2*(y+2) + 3*y == 5*y+4. For this purpose, the
core maintains the notify list data structure, which is
interfaced through setup() and update() methods.
Every term in the core must be setup, and as a part of that
process, the method setup() of the appropriate DP is called.
Here the decision procedure has a chance to register notification
requests related to the given expression. These requests are added to
the notify lists of the relevant expressions using
Expr::addToNotify() member method.
Normally, a DP wants to be notified when immediate children of the
expression change. For instance, for an expression 2*x, if
the variable x changes, the arithmetic DP wants to be
notified about it. Therefore, in the setup(2*x) call, it
adds 2*x to the notify list of x by calling
x.addToNotify(this, 2*x). The first argument (this)
is the reference to the current Theory subclass.
Later, when x changes its canonical representative, say, to
y+2, its notify list is consulted, and the update(x==y+2,
2*x) call is made. The first argument is a directed equality
informing the DP of what has changed, and the second is the expression
for which this change is relevant. In this particular example,
update() will enqueue a new fact: 2*x==2*y+4.
Similarly, when 2*x+3*y is being setup, its immediate
children (2*x and 3*y) get the entire expression
added to their notify lists. Later, when 2*x changes its
canonical representative to 2*y+4 due to the previous
update() call, another update() call is made with
2*x==2*y+4 for 2*x+3*y, and a new fact is
enqueued: 2*x+3*y==5*y+4, and so on.
Note, that the notify list mechanism is not restricted to only
immediate children. For instance, for high-degree monomials in
non-linear arithmetics (e.g. x^2*y) it makes sense to
register them with all factors (in this case, x,
x^2, y, and x*y) which are not necessarily
subexpressions of the original monomial (x^2*y).
Finally, sometimes a decision procedure may want to know that the
current context has become inconsistent, and this what
notifyInconsistent() call is for. To date, only the
quantifier theory uses it to find out which instantiations were useful
in producing a conflict. Most likely, you do not need it.
\subsection theory_api_backtrack Backtracking Data Structures
Up to this point, the description assumed that new facts are always
added to the current logical context, and never
removed. This, of course, is not true in reality, since when a
conflict is found, the SAT solver will backtrack, and
restore the context to what it was before the assertion of a
splitter. In particular, this means that each decision procedure
needs to restore all its internal state to the same point. You may
have noticed that there is no Theory API call to signal such
backtracking. How the heck can a DP restore the state if it does not
know when the core backtracks?
The trick used in CVC3 is actually quite simple and elegant: DP
does not have to know about backtracking, it indeed works
under the assumption that facts can only be added to the context.
However, all of its internal state must be stored in backtracking
data structures, which backtrack automatically with the core.
Such backtracking data structures are called context-dependent
objects (CDO). There are currently three pre-defined
context-dependent data structures: CDO (context-dependent
object, cdo.h), CDList (backtracking stack,
cdlist.h), and CDMap (backtracking map, similar to
STL map, cdmap.h).
Class CDO is a templated class for any C++ class which can be cleanly
copied with operator=() and copy constructor, and which have
the default constructor (this is how these objects are saved and
restored on backtracking). CDO is best suited for individual
variables (array indices, Expr or Theorem variables,
etc.).
Class CDList is a backtracking stack, and its API is very similar to
that of STL vector. You can push_back() elements onto the
stack, check the size() of the stack, and look up individual
elements with operator[]. You cannot, however,
modify or remove elements from the list. Keep in mind, that the size
of the list may change between the API method calls, which means, you
should keep any persistent indices to the list in backtracking
variables (CDO).
Class CDMap is a templated class very similar to STL map.
You can add new key-value pairs to it, you can modify the value under
a key, but you cannot remove a pair from the map.
Let me repeat this again: all persistent data in a decision
procedure MUST be stored in backtracking data structures!
There are some rare exceptions to the rule (like storing the
Expr representing the value "0" to avoid re-building it), but
generally, you do not even want to know about backtracking. It is all
done under the hood, and you should not care.
\subsection theory_api_ileaves Variables and Foreign Terms (i-Leaves)
Any subterm that your decision procedure cannot recognize must be
treated as a variable. This is a very important point, and it may
cause a long-lasting confusion for the beginning developers if not
understood from the start. Read it carefully, several times, until
you are sure you never forget it, even if I wake you up in the middle
of the night.
What CVC3 knows as a "variable" has nothing to do with what a
decision procedure considers a "variable." These two are not very
much related.
A variable (or an i-leaf) from the DP point of view is either
a CVC3 variable, or a shared term from some other theory.
For instance, in 2*arr[idx]-3*y, the subterm
arr[idx] belongs to the theory of arrays, and therefore, is a
variable (an i-leaf) as far as the theory of arithmetic is concerned.
Similarly, y is a variable in the theory of arithmetic,
because it is also a CVC3 variable.
Such a definition does not provide a direct test for an i-leaf.
Instead, you have to check whether this term is one of "yours" (one
that your theory knows about), which is usually determined by the
expression kind. If not, then it is a variable, as far as your theory
is concerned.
But never make any assumptions about an i-leaf; it can be any
expression whatsoever, and Expr::isVar() tester will
not necessarily return true for it. In other words,
there is no such thing as a variable in your theory. There
are only terms you cannot recognize, which you treat as variables.
\subsection theory_api_inputs Methods Reimplemented in a Subclass
\subsection theory_api_assertFact assertFact()
There are no tricky invariants for this method. The only important
property is that all the facts that are submitted to the DP through
this call become part of the logical context which the DP must check
for satisfiability.
Mathematically, the asserted fact \f$\phi\f$ is added to the logical
context \f$\Gamma\f$, and the job of the decision procedure is to
check whether \f$\Gamma\f$ is satisfiable or not; in other words,
we are solving the problem \f$\Gamma\models\bot\f$.
When the decision procedure receives a new fact \f$\phi\f$, it may
either save this fact in its internal database for later processing,
or may immediately process it, and possibly derive new facts
\f$\{\psi_1,\ldots,\psi_k\}\f$ from \f$\phi\f$ and submit them back to
the core (enqueueFact()).
In the case when the set of derived facts is equisatisfiable with the
original fact \f$\phi\f$, the decision procedure does not need to keep
\f$\phi\f$ in its database; the completeness will still be preserved.
For instance, if the DP receives r1=r2, where r1 and
r2 are records with fields f1 and f2, then
the two facts r1.f1=r2.f1 and r1.f2=r2.f2
(equalities of the individual fields) together are logically
equivalent to the original fact r1=r2, and therefore,
enqueuing them is sufficient for preserving completeness. The
original fact need not be saved.
\subsection theory_api_checkSat checkSat(bool fullEffort)
The most important invariant (for completeness) is that when
fullEffort is true, the DP must do all the work that
it has postponed to find out if the current context is indeed
satisfiable. In particular, if satisfiability can be achieved by
making some of the shared terms equal, it must be done at this time
(see \ref theory_api_addSharedTerm for more info on shared terms).
This call is your last warning: if you do not act now, the whole
system will stop and notify the user. However, the worst that can
happen is that CVC3 becomes incomplete (it may report
InValid when the query is actually valid). It still remains
sound, however. That is, the Valid answer will still be
correct.
When fullEffort is false, the DP may choose to do as
much or as little work as it wants.
\subsection theory_api_setup setup()
Add the given expression e to the notify list of all the
expressions t1...tn whose change would
affect the value of e. Normally, such expressions are the
immediate children of e.
Whenever the canonical representative of any ti
changes in the union-find database, a corresponding call to
update() will be made, and the DP will have a chance to
re-process the expression e to keep it up-to-date.
\subsection theory_api_update update()
The property of this call is similar to assertFact(): the new
fact becomes part of the logical context. However, the facts it
receives do not necessarily belong to your theory, and are only
reported because you asked the core to do so in setup().
Also, the new equalities that update() derives must be
submitted through enqueueEquality() call. This also means
that the right-hand side of the submitted equalities must be fully
simplified. See \ref theory_api_enqueueEquality for more
information.
\subsection theory_api_addSharedTerm addSharedTerm()
A term is called shared if it belongs to theory X, and appears
as an i-leaf in a term from theory Y (that is, it's a Y-leaf). In
this case, the term is shared by theories X and Y. For example, in
2*arr[idx]-3 the subterm arr[idx] belongs to the
theory of arrays, but the entire term is an arithmetic expression;
hence, arr[idx] is a shared term.
When such a term appears in the system, the core notifies both
theories about the term.
Completeness of the CVC3 framework relies on the invariant that
decision procedures propagate all the equalities between shared terms
that can be derived in the current logical context.
Often, the algorithms in DPs are designed to propagate all the
equalities automatically (over all terms, including shared). In this
case, addSharedTerm() need not be re-implemented.
In some cases, however, the DP has to take extra effort to satisfy the
above invariant, and it is more efficient to restrict this extra
effort only to the set of shared terms. In this case,
addSharedTerm() needs to collect the set of shared terms in a
database
(which, of course, has to be \ref theory_api_backtrack "backtrackable"),
and use it in the checkSat() call.
\subsection theory_api_rewrite rewrite(Expr e)
This function must return a rewrite Theorem of the form
e==e1 (or e<=>e1 if e is a formula), where
e1 is a logically equivalent term or formula.
This function can assume that all the immediate children of e
are already completely simplified and rewritten. The same property
must hold for the result of the rewrite.
Another invariant that rewrite() has to preserve is that if
the result of a rewrite is an equality (you return
e<=>(e1==e2)), then in the resulting equality e1 >=
e2 w.r.t. operator>=(Expr, Expr), the fixed total
ordering on all expressions given by the expression manager. This
invariant is important for termination of the simplifier, since
equalities in CVC3 are used as (directed) rewrite rules, replacing
the left-hand side (e1) with the right-hand side
(e2).
The core will call the rewrite() function iteratively on the
right-hand side of the result, until the expression does not change.
However, if the rewriting algorithm can guarantee that in a particular
case no further rewrites from this theory will change the expression,
the result can be flagged as a normal rewrite. In this case,
the core will not call rewrite() again, resulting in better
performance. The property that the expression indeed does not change
with further rewrites is checked in the "debug" build, and any
violation triggers assertion failures with ``Simplify Error
1'' and ``Simplify Error 2'' messages.
It is important to understand that the iterative call to
rewrite() only applies to the top-level node, and
not to subexpressions. That is, if rewrite()
changes the subexpressions (and not only the top-level operator), then
it may violate another invariant that all the children of the result
are completely rewritten and simplified. If this invariant cannot be
guaranteed, then rewrite() needs to call
simplifyThm() method explicitly.
Here is an example of a rewrite function:
\verbatim
Theorem TheoryFoo::rewrite(const Expr& e) {
Theorem res;
if(isBar(e)) {
res = reflexivityRule(e);
res.getRHS().setRewriteNormal(); // No more rewrites needed
} else {
// May need to rewrite several times
res = < do real work >
}
return res;
}
\endverbatim
\subsection theory_api_solve solve(Theorem e)
This method takes an equality e (as a Theorem
object) and turns it into a logically equivalent solved form:
a conjunction of fully simplified equalities, possibly existentially
quantified. The terms on the left-hand sides cannot appear on any of
the right-hand side terms, and every free variable in the solved form
is also a free variable of e. (New variables in the solved
form must be existentially quantified).
According to Clark Barrett's Ph.D. thesis, only one theory is allowed
to have a solver. In CVC3, such theory is the theory of
arithmetic. The restriction to a single solver in CVC3 is
somewhat relaxed, and several theories can have their own solvers,
provided that the solved form that such a secondary solver generates
is also a solved form w.r.t. the theory of arithmetic. This is the
only asymmetric and non-local invariant in the core of Theory API.
\subsection theory_api_computeType computeType()
The basic CVC3 type checking mechanism is a simple recursive
descent into the term structure, and it is implemented as a
getType() method in the base Theory class.
When computing a type of an expression e, this method
determines which DP owns the expression, and calls the appropriate
computeType() method, which is expected to check the
expression for type consistency, and return the exact type of the
expression. The return type is then cached as an attribute on the
expression e for a fast look-up in the subsequent calls to
getType().
Each decision procedure must implement computeType() method
for all of its operators. For example, the theory of records has an
operator for constructing record literals, for extracting a field of a
record, and for updating a field of a record. This means that
computeType() needs to be able to compute the types for these
three operators, and verify that all subexpressions are of expected
types.
Since subtypes in CVC3 are handled by TCCs, type consistency at
this stage is only checked with respect to the base types,
which is returned by getBaseType() method provided by the
base Theory class. For example, if a record expression
e has a field foo of type INT, and the
expression is a record update e WITH .foo := t, where
t is of type REAL, then this expression is
considered well-typed, since the base types of both INT and
REAL is REAL.
An important property of computeType() is that it must not
only compute the exact type of the expression, but also verify that
all subexpressions are type-correct, relative to their base
types. For instance, x = y has type BOOLEAN,
and it is only type-correct if getBaseType(x) ==
getBaseType(y). If this property is violated,
TypecheckException must be thrown with the appropriate
message. Note, that the exact types of x and y may
be different, and even disjoint.
\subsection theory_api_computeTCC computeTCC()
Type Correctness Condition (TCC) for an expression e (which
can be either a term or a formula) is a formula De
such that De is true if and only if e is
defined (or denoting) in the current logical context.
For example, an expression x/y is undefined when
y=0, and is defined otherwise. Therefore, \f$D_{x/y}\equiv
y\ne 0\f$.
\subsection theory_api_notifyInconsistent notifyInconsistent()
\subsection theory_api_print print()
The most important property of this method is that the printed
expressions have to be parsable by the appropriate CVC3 parser.
That is, CVC3 must be able to read what it prints.
The recursive call to the global pretty-printer is implemented through
the overloaded operator<< for ExprStream. Read the
documentation on ExprStream class before coding.
Once coded, test your printer code! Print all the
kinds of expressions from your theory, make the expressions large and
complex, interspersed with terms from other theories, etc. Make sure
it both looks good, and CVC3 can read every term it prints.
Here's an example of the print() method:
\verbatim
ExprStream&
TheoryFoo::print(ExprStream& os, const Expr& e) {
switch(os.lang()) {
case PRESENTATION_LANG:
switch(e.getKind()) {
case ARROW:
os << "[" << push << e[0] << space << "-> " << e[1] << push << "]";
break;
case EQ:
os << "(" << push << e[0] << space << "= " << e[1] << push << ")";
break;
case NOT: os << "NOT " << e[0]; break;
.................
default:
// Print the top node in the default LISP format, continue with
// pretty-printing for children.
e.print(os);
}
break; // end of case PRESENTATION_LANGUAGE
case INTERNAL_LANG:
..................
break; // end of case INTERNAL_LANG
default:
// Print the top node in the default LISP format, continue with
// pretty-printing for children.
e.print(os);
}
return os;
}
\endverbatim
\subsection theory_api_parseExprOp parseExprOp()
This method is not used yet, and is likely to change in the near
future.
\subsection theory_api_outputs Methods Provided by Theory API
The Theory API consists not only of methods to re-implement in the
subclass, but it also provides convenient methods in the base class
which subclasses can readily use. These methods subdivide into the
following categories:
- Methods for sending information to the core from a theory (a
decision procedure),
- \ref theory_api_core_proof_rules "Common proof rules",
- Other convenient methods.
\subsection theory_api_enqueueFact enqueueFact()
Normally, a decision procedure should use this and only this
method for reporting newly derived facts back to the core. The only
exception is a contradiction (a FALSE Theorem), which should
be reported through setInconsistent() method for efficiency.
Other exceptions include facts derived by the update()
function, which may be reported through
\ref theory_api_enqueueEquality.
\subsection theory_api_setInconsistent setInconsistent()
Similar to \ref theory_api_enqueueFact, except that it
requires the new Theorem to be FALSE (a contradiction).
This method is used for more efficient processing of the derived
contradiction, bypassing the fact queue.
\subsection theory_api_simplifyThm simplifyThm() and simplify()
In rare cases, a decision procedure may want to simplify a given
expression w.r.t. the current context, and this is the function to
call.
Be careful! This method may call your own
\ref theory_api_rewrite method recursively.
It is also a relatively expensive function to call, so avoid it if
possible.
\subsection theory_api_enqueueEquality enqueueEquality()
This function can be used in \ref theory_api_update to
propagate the equalities induced by the given equality e1==e2
as the argument to update().
In CVC3, equalities are treated as directional, meaning
that left-hand side is always being replaced by the right-hand side.
This means that if in the expression d there is a
subexpression e1, then it must be replaced by e2.
It also means that the resulting expression d2 must be
reported to be equal to the original one as d==d2, and
not the other way around.
Since the core may occasionally swap equalities submitted through
\ref theory_api_enqueueFact (for termination reasons), it is
important to submit the above equality by-passing the swapping engine.
This is where enqueueEquality() is useful.
Invariant: enqueueEquality() expects the
argument to be a Theorem of the form e1==e2, where
e2 is a fully simplified expression in the current
context. That is, e2 == simplify(e2). You are responsible
for maintaining this invariant in your decision procedure.
\subsection theory_api_inconsistentThm inconsistentThm() and inconsistent()
If the context is inconsistent, the inconsistent() method
returns true, and inconsistentThm() returns the
Theorem of FALSE (a proof of the contradiction).
\subsection theory_api_isTerm isTerm()
Tests whether the given expression is a term (as opposed to a formula).
\subsection theory_api_isLiteral isLiteral()
Tests whether the given formula is a literal (is an atomic formula or
its negation).
\subsection theory_api_isAtomic isAtomic()
Tests whether the given term or formula is atomic. In CVC3 it
means that the expression does not contain formulas as subexpressions
(for instance, as conditions in the IF-THEN-ELSE operator).
This method is relatively expensive when called for the first time,
but it caches the result in the Expr attributes, so the amortized
complexity tends to be rather low.
\subsection theory_api_updateHelper updateHelper()
This method replaces all the immediate children of the given
expression by their canonical representatives w.r.t. the union-find
database, and returns the corresponding Theorem e==e1.
This function is convenient to use inside \ref theory_api_update
for rewriting d, the expression being updated. However, it
only works when the changed subexpression in d is its
immediate child.
\subsection theory_api_getType getType()
\subsection theory_api_getTCC getTCC()
\subsection theory_api_parseExpr parseExpr()
\section theory_api_proofs Proof Rules: The Trusted Core
Every proven formula (or fact) in CVC3 appears in the
form of a Theorem. Values of type Theorem have a
special property: they cannot be constructed in any way but through
the proof rules. This is implemented by making all the
constructors of this class private. The only exception is the default
constructor, which creates a null theorem, and it can only be used to
create uninitialized variables of type Theorem, and assign
them later.
A proof rule is a function which takes premises (previously
generated theorems) and other parameters, and generates a new theorem.
The implementation of proof rules comprises the trusted core
of CVC3, and the soundness of the tool relies entirely on the
soundness of this core. In other words, no matter what the bulk of
the code does, if CVC3 derives the validity of a particular fact,
it is guaranteed that that theorem is indeed valid, provided the
trusted implementation is correct and sound.
For this reason, it is prudent to keep the trusted core reasonably
small, and more importantly, keep each proof rule clean and simple, so
that the correctness of the rule itself (mathematically) and its
implementation can be easily verified by manual inspection.
For the same reason, every rule must be thoroughly
documented. It's a very good idea to include a LaTeX formula
for the proof rule that a function implements. Keep in mind that
reverse-engineering the mathematical meaning of a proof rule is a
daunting task, especially if the code is rather long and complex.
Check out src/include/common_proof_rules.h for examples.
\subsection theory_api_proof_classes Hierarchy of Classes
Like anything else in CVC3, there is an API for implementing proof
rules defined by the class TheoremProducer. This class
provides two protected methods, newTheorem() and
newRWTheorem(), to its subclasses, which can create arbitrary
Theorem values. Therefore, a part of the code is considered
trusted whenever the file contains \#include
"theorem_producer.h" statement in it. To enforce this,
theorem_producer.h requires a macro symbol
_CVC3_TRUSTED_ to be defined (otherwise, a compiler warning
is generated).
Important: the _CVC3_TRUSTED_ symbol must be
defined only in *.cpp files, and never in *.h, to
prevent accidental inclusion of theorem_producer.h, and thus,
inadvertently making large portions of code trusted.
Exporting the proof rules to the untrusted code (class
TheoryFoo) is implemented through the custom API in
foo_proof_rules.h (abstract class FooProofRules),
whose pure methods are the proof rules. This header file does not
include theorem_producer.h, and therefore, is suitable for
inclusion by untrusted code.
The implementation of FooProofRules consists of the
implementation API: foo_theorem_producer.h (class
FooTheoremProducer, inherits from FooProofRules and
TheoremProducer), and the implementation proper in
foo_theorem_producer.cpp.
Normally, theorem_producer.h is included from
foo_theorem_producer.h (to declare
FooTheoremProducer as a subclass of
TheoremProducer), and foo_theorem_producer.h is
included from foo_theorem_producer.cpp:
\verbatim
// File foo_proof_rules.h
class FooProofRules {
....
};
\endverbatim
\verbatim
// File foo_theorem_producer.h
#include "theorem_producer.h"
class FooTheoremProducer: public FooProofRules, public TheoremProducer {
....
};
\endverbatim
\verbatim
// File foo_theorem_producer.cpp
#define _CVC3_TRUSTED_
#include "foo_theorem_producer.h"
....
\endverbatim
In order for the theory code to use the proof rules, a pointer to
FooProofRules is declared in the TheoryFoo class:
\verbatim
// File theory_foo.h
class FooProofRules;
class TheoryFoo: public Theory {
......
FooProofRules* d_rules;
//! Create an instance of FooProofRules class
FooProofRules* createProofRules(VCL* vcl);
.....
};
\endverbatim
Since instantiating FooProofRules requires creating a new
object of class FooTheoremProducer, which belongs to trusted
part of the code, the implementation of createProofRules()
method needs to be in foo_theorem_producer.cpp, rather than
in theory_foo.cpp. This is the only exception to the rule
that everything declared in a X.h file must be implemented in
the corresponding X.cpp file in CVC3.
\verbatim
// File foo_theorem_producer.cpp
......
FooProofRules* TheoryFoo::createProofRules(VCL* vcl) {
return new FooTheoremProducer(vcl);
}
.....
\endverbatim
In the TheoryFoo() constructor, the class variable
d_rules is initialized by calling createProofRules():
\verbatim
// File theory_foo.cpp
.....
TheoryFoo::TheoryFoo(VCL* vcl): Theory(vcl, "Foo") {
.....
d_rules = createProofRules(vcl);
.....
}
// Destructor: destroy the proof rules class
TheoryFoo::~TheoryFoo() { delete d_rules; }
.....
\endverbatim
\subsection theory_api_proof_rule Implementing a Proof Rule
Each function implementing a proof rule has several components:
- Soundness check(s),
- The actual computation of the result,
- Building assumptions,
- Building a proof,
- Constructing a new Theorem object.
There is a special macro CHECK_SOUND(cond, message) for
checking the soundness conditions. If the condition cond
does not evaluate to true, then a SoundException is
thrown with the message string.
For efficiency, the user may decide to skip the soundness checks. In
order to honor this decision, all soundness checks must be supressed
when the CHECK_PROOFS macro evaluates to false:
\verbatim
if(CHECK_PROOFS) {
CHECK_SOUND(denominator != 0,
"TheoryArith: Division rule: denominator == 0");
}
\endverbatim
Soundness checks play the most important role in
making CVC3 sound. Your implementation must guarantee that if all
soundness checks pass, then the rule is indeed sound to apply, and the
theorem you generate at the end is indeed a theorem. Soundness checks
include verifying that the premises (Theorems given as arguments) are
of the expected format, and all the additional parameters satisfy all
the side conditions of the proof rule.
Soundness checks must be complete and self-sufficient
(bullet-proof) within the rule; that is, no matter how the rule is
called and with which arguments, there should be no way for the rule
to generate an invalid theorem. Even if the untrusted code which
calls the rule does all the necessary checks, you have to do them
again inside the rule. This is the whole point of the code being
trusted: it cannot go wrong, no matter what happens outside.
In case of CVC3, this means that any non-null
Theorem object represents a valid theorem, no matter
how this theorem was generated.
The main component of a Theorem object is a formula (returned
by Theorem::getExpr()), which is valid in the appropriate
logical context. The logical context is defined by the set
of assumptions carried along in the Theorem object.
Mathematically, a theorem object represents a sequent
\f$\Gamma\vdash\phi\f$, where \f$\Gamma\f$ is the set of assumptions
(formulas assumed to be true), and \f$\phi\f$ is the theorem itself,
the formula which logically follows from \f$\Gamma\f$.
Typically, an inference rule has the following format:
\f[\frac{\Gamma_1\vdash\phi_1\quad\cdots\quad\Gamma_n\vdash\phi_n}
{\Gamma_1,\ldots,\Gamma_n\vdash\psi}\f]
where the assumptions of the conclusion are the union of all the
assumptions from the premises.
The easiest way to compute the set of assumptions for the conclusion
is to use the overloaded method merge() provided by the
theorem.h API. This way you do not have to bother about the
internal representation of assumptions. Keep in mind, that CVC3
may be running in the mode without assumptions (for efficiency), which
can be queried by withAssumptions() method:
\verbatim
Theorem fooRule(const Theorem& prem1, const Theorem& prem2) {
.....
Assumptions a;
if(withAssumptions())
a = merge(prem1, prem2);
....
}
\endverbatim
If the rule accepts more than two premises, you can merge assumptions
by passing the vector of all premises to the merge() method.
When there is only one premis, the simplest way is to make a clean
copy of the assumptions from the premis:
\verbatim
if(withAssumptions())
a = premis.getAssumptionsCopy();
\endverbatim
Occasionally, one needs to remove assumptions from the set, as in the
following rule:
\f[\frac{\Gamma_1\vdash\alpha\quad \Gamma_2, \alpha\vdash\phi}
{\Gamma_1, \Gamma_2\vdash\phi}\mbox{Cut}\f]
In this case, you can use the overloaded operator-() for
class Assumptions:
\verbatim
Theorem cutRule(const Theorem& alpha, const Theorem& phi) {
.....
Assumptions a;
if(withAssumptions())
a = (phi.getAssumptions() - alpha.getExpr()) + alpha.getAssumptions();
....
}
\endverbatim
Remember, however, that due to the internal representation used in CVC3,
removing an assumption is quite expensive, while merging is very
cheap.
Also, if the soundness of your rule relies on the presence (or
absence) of certain assumptions in premises, the first thing you need
to check for is that withAssumptions() returns true
(otherwise there is no way to determine the soundness of the rule, so
it should not be called in the mode without assumptions).
In the above rule, if the assumption \f$\alpha\f$ were required to be
present for soundness of the rule, one could check it as follows:
\verbatim
const Expr& alphaExpr = alpha.getExpr();
const Assumptions& phiAssump = phi.getAssumptionsRef();
if(CHECK_PROOFS) {
CHECK_SOUND(withAssumptions(),
"TheoryFoo::cutRule: called without assumptions!");
CHECK_SOUND(!phiAssump[alphaExpr].isNull(),
"TheoryFoo::cutRule: alpha is not an assumption of phi");
}
\endverbatim
It is extremely important that assumptions are
computed correctly when withAssumptions() returns
true, since assumptions are used by the SAT solver in the
core framework, and are absolutely crucial for the results to be
correct (or sound). Remember, that assumptions represent the
logical context where the theorem is true, and if they are not
computed properly, the entire theorem may become invalid.
Each Theorem object carries a proof of itself in the
form of a Proof object.
Proofs are relatively expensive to generate (they take up extra space
and somewhat slow down the rule execution), and therefore, it is the
user's privilege to turn them off. For this reason, all proof
generation code must be guarded by the method withProof(),
similarly to withAssumptions().
CVC3 uses a version of Natural Deduction as its logical basis, and
exploits the idea of Curry-Howard isomorphism to represent proofs as
terms over function symbols representing proof rules. A
``type'' of a proof term is the formula (theorem) derived by
the corresponding proof. The correctness of a proof in this framework
corresponds to the proof term being ``well-typed.''
The TheoremProducer API provides an overloaded method
newPf() for building proof terms. The first argument of
(almost) any newPf() version is the name of the proof rule
(string), and the rest are the arguments (parameters as
Expr values, and the proofs of the rule's premises).
The key idea in building a proof term for the rule is to provide
enough information in the proof term to be able to re-run the rule
again with exactly the same arguments.
For instance, a proof term for the following rule:
\f[\frac{\Gamma_1\vdash x < y \quad \Gamma_2\vdash y < z}
{\Gamma_1, \Gamma_2\vdash x < z}\mbox{project}
\f]
can be constructed as follows:
\verbatim
Theorem projectRule(const Theorem& xLTy, const Theorem& yLTz) {
.....
Proof pf;
if(withProof()) {
vector exprs;
vector pfs;
exprs.push_back(xLTy.getExpr()); exprs.push_back(yLTz.getExpr());
pfs.push_back(xLTy.getProof()); pfs.push_back(yLTz.getProof());
pf = newPf("project", exprs, pfs);
}
.....
}
\endverbatim
It is a very good idea to describe the proof object arguments in the
proof rule documentation (doxygen comments) in
foo_proof_rules.h file.
Note, that the proof object does not carry around any
information about the assumptions. This is because all the
assumptions are present implicitly as ``types'' of bound proof
variables in LAMBDA-terms. I skip the description of this issue in
the current version of this document, as it is rather subtle, and
there is a 99\% chance that you do not need to know that for your DP.
Finally, creating the resulting theorem in the proof rule is usually
done by calling newTheorem(conclusionExpr, a, pf), where
a is the Assumptions variable, and pf is
the proof term.
There is a special class of proof rules called rewrite rules
in CVC3. These are proof rules without premises (axioms)
whose conclusion is of the form expr1 = expr2 or frm1
<=> frm2. These rules are so ubiquitous in the system that there
is a special optimized constructor for the corresponding theorems:
newRWTheorem(expr1, expr2, a, pf).
Note: the Theorem object constructed by the
rewrite rule has a different internal representation from the "normal"
Theorem object. Therefore, constructing a rewrite theorem
with newTheorem(expr1.eqExpr(expr2), a, pf) will result in a
run-time error. Make sure that if your theorem is a rewrite theorem
(an equality = or equivalence <=>), then it must be
constructed using the newRWTheorem() method.
I did not mention anything about computing the expression for the
conclusion of the rule, but it should be fairly obvious how to do
this. Remember, that each proof rule should be coded as concisely and
as cleanly as possible, to ensure the effectiveness of manual
inspection. Remember, this is a trusted part of the code.
Keep It Simple,
Stupid. :-)
Sergey Berezin / berezin AT stanford DOT edu
*/