My research concerns artificial intelligence approaches to the study and teaching of reasoning, specifically formal, mathematical reasoning.
Logic EducationI am the project manager of the Openproof project at Stanford University's Center for the Study of Language and Information (CSLI). The project has two aims:
Courseware
Our project has been developing courseware packages for undergraduate logic education for more than twenty years. The courseware packages consist of a text book combined with software learning environments used for studying the logic concepts described.
Language, Proof and Logic is the successor to The Language of First-Order Logic. The package aims at teaching both the model and proof theories of first-order logic. The package comes with four applications: Tarski's World, Fitch, Boole and Submit.
Fitch allows users to construct proofs in a natural deduction formalism. The program is able to verify that proofs are correctly constructed, and that they verify required consequences. In addition to the inference rules of Fitch are three procedures for testing for propositional, first-order and blocks world consequence. This allows users to verify a consequence before attempting to build a proof. Fitch may apply constraints to proofs, indicating that users may use only certain inference rules and/or apply them in only limited ways.
Boole is designed to allow users to construct truth tables for sentences. The user enters a sentence, or sentences, and then constructs a column for each atomic subformula of the sentences. These reference columns are filled in with truth values, and then the user may assign truth values to all the subformulae of the sentences, including eventually the sentences themselves for each combination. Finally he user may make judgments about the sentences (E.g., that it is a tautology) or relationships between sentences (E.g., they are propositionally equivalent.). the correctness and completeness of the truth tables constructed in Boole are automatically checked, as are the judgments.
Submit is quite a different application. Submit is used to send work completed using the other applications in response to exercises in the textbook to our automated grading service. The Grade Grinder accepts this work and assesses it, returning a report on the work by email to the submitter, and optionally a specified instructor.
Hyperproof teaches heterogeneous reasoning within the blocks world. By heterogeneous reasoning, we mean that the premises and conclusions or the reasoning may be expressed in different representations, say diagrams and sentences, for example. We believe that this, not the formal sentence-based reasoning of mathematicians and logicians is the typical case. When we determine a route to a party by looking at an address and a street map, or read data from a bar graph, or examine the results from a spreadsheet, we are performing heterogeneous reasoning.
The Hyperproof application embodies an instance of heterogeneous reasoning, namely that involving reasoning with first-order sentence and blocks world diagrams. As in Fitch users are required to produce proofs of various facts. However, the givens of the problem may include a diagram which expresses given information just as the sentences do. The inference rules of the Hyperproof logic include the standard first-order inference rules, and additional rules which enable the extraction of sententially expressed information from the diagram, and the addition of features to the diagram on the basis of sententially expressed information.
The Openproof project is continuing the work on heterogeneous reasoning that we began with the implementation of Hyperproof. We are currently developing an architecture for heterogeneous reasoning called Openproof. The idea of this architecture is to enable the building of heterogeneous reasoning environments in a variety of domains. Openproof is a computational environment with a plug-in architecture. The plug-ins will implement particular representation systems, together with their inference system. One such plug-in might implement first-order logic, another Venn diagrams, and so on. By building such plug-ins and configuring Openproof to use them in a variety of combinations, a range of heterogeneous reasoning environments might be easily constructed. Thus Openproof generalizes Hyperproof in architecture, and allows the construction of Hyperproof as the combination of two plug-ins: for the blocks world and first-order logic.
We plan to use the Openproof architecture to construct a new learning environment, called Playfair, which will be included in our next courseware package. Playfair will include plug-ins for the blocks world, first-order logic, Venn diagrams, state machine diagrams, and Hasse diagrams .
The Role of Diagrams in Mathematics&/GROVER is the name of a pair of programs aimed at investigating the role of diagrams in mathematical proofs. Despite the widespread prejudice against the use of diagrams in mathematical proofs, we observed that many mathematics books make use of diagrams. Our goal in this project was to consider the uses of these diagrams, and to exploit diagrams in the search for mathematical proofs. The idea here is this. Even if the diagrams are used only to ``guide the intuition'' of the reader, could they not be used to guide an automated reasoning program in the same way?
GROVER is the name of an expert system which accepts as input a mathematical diagram and a conjecture, and produces guidance for an automated reasoning system about how to prove the conjecture. The idea is that in some cases, a mathematical diagram can be interpreted as representing strategic information about how to complete a proof.
& (Automated Natural Deduction) is the name of our underlying automated reasoning system. & implements a natural deduction system for reasoning in Zermelo set theory. & is a somewhat conventional natural deduction theorem prover, with the addition of some heuristics for controlling search and some inference rules specialized for reasoning in Zermelo set theory. These allow & (Automated Natural Deduction) to prove, among other results, Cantor's theorem.
Together &/GROVER are able to prove some non-trivial theorems in mathematics, including the Schroeder-Bernstein theorem, and the diamond lemma. Significantly, & alone (i.e. without the guidance obtained from the diagram via GROVER) is unable to prove those theorems automatically.
Definition and Lemma Use in MathematicsOne feature of mathematical reasoning is that mathematicians frequently refer back to previously proved results (lemmas) in the proof of a new result. Mathematicians work in a rich context of existing mathematics as they produce new results -- this is true for students as well as research mathematicians.
A challenge for the designers of automated reasoning systems, systems which are to find proofs without human intervention, is how to intelligently use this context. Some automated reasoning systems require that users specify the lemmas (and/or definitions) that are to be used in a proof, while others use none, and re-derive results from first principles within a new proof. Both approaches are unsatisfactory for obvious reasons.
Gazing is a technique that I developed for addressing this issue. The lemmas and definitions available in a mathematical context is represented in a hierarchy of abstraction spaces. When the automated reasoning system recognizes that it needs to access the mathematical theory, it makes a plan in these abstraction spaces to determine which facts to use and how. Initially it considers only the most abstract representation of the facts, and chooses those that it thinks relevant to the problem at hand. It then examines these choices at a more detailed level of abstraction, and possibly discards some facts, repeating with increasing detail until arriving at a specific choice of lemmas. These are then used and the proof search continues. By the use of abstraction, the complexity of the underlying search space is tamed, allowing the system to consider only those facts that it has reason to believe that it needs.
Gazing is implemented as part of the Gazer theorem prover.