X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=blobdiff_plain;f=helm%2Fpapers%2Fcalculemus-2003%2Fhbugs-calculemus-2003.tex;h=431fbb9b716186c01bbf51f162fc02f9c74653c3;hb=4167cea65ca58897d1a3dbb81ff95de5074700cc;hp=f3fee974894bb265d7165f24ad065e0d6fb36f5b;hpb=ae138b9558f6d0f2904b906aa08e633f56114fa7;p=helm.git

diff --git a/helm/papers/calculemus-2003/hbugs-calculemus-2003.tex b/helm/papers/calculemus-2003/hbugs-calculemus-2003.tex
index f3fee9748..431fbb9b7 100644
--- a/helm/papers/calculemus-2003/hbugs-calculemus-2003.tex
+++ b/helm/papers/calculemus-2003/hbugs-calculemus-2003.tex
@@ -8,29 +8,35 @@
 \newcommand{\myincludegraphics}[5]{
 	\begin{figure}[#2]
 	\begin{center}
-	\includegraphics[width=#3]{XFIG-EPS/#1.eps}
+	\includegraphics[width=#3]{eps/#1.eps}
 	\caption[#4]{#5}
 	\label{#1}
 	\end{center}
 	\end{figure}
 }
 
-\usepackage[show]{ed}
-\usepackage{draftstamp}
+%\usepackage[show]{ed}
+%\usepackage{draftstamp}
 
+\newcommand{\musing}{\texttt{musing}}
+\newcommand{\musings}{\texttt{musings}}
 \newcommand{\ws}{Web-Service}
 \newcommand{\wss}{Web-Services}
 \newcommand{\hbugs}{H-Bugs}
 \newcommand{\helm}{HELM}
+\newcommand{\Omegapp}{$\Omega$mega}
+\newcommand{\OmegaAnts}{$\Omega$mega-Ants}
 
 \title{Brokers and Web-Services for Automatic Deduction: a Case Study}
 
-\author{Claudio Sacerdoti Coen \and Stefano Zacchiroli}
+\author{
+  Claudio Sacerdoti Coen\thanks{Partially supported by `MoWGLI: Math on the Web, Get it by Logic and Interfaces', EU IST-2001-33562} \and
+  Stefano Zacchiroli\thanks{Partially supported by `MyThS: Models and Types for Security in Mobile Distributed Systems', EU FET-GC IST-2001-32617}}
 
 \institute{
   Department of Computer Science\\
   University of Bologna\\
-  Via di Mura Anteo Zamboni 7, 40127 Bologna, ITALY\\
+  Mura Anteo Zamboni 7, 40127 Bologna, ITALY\\
   \email{sacerdot@cs.unibo.it}
  \and
   Department of Computer Science\\
@@ -47,11 +53,11 @@
 
 \begin{abstract}
   We present a planning broker and several Web-Services for automatic deduction.
-  Each Web-Service implements one of the tactics usually available in an
-  interactive proof-assistant. When the broker is submitted a "proof status" (an
-  unfinished proof tree and a focus on an open goal) it dispatches the proof to
-  the Web-Services, collects the successfull results, and send them back to the
-  client as "hints" as soon as they are available.
+  Each Web-Service implements one of the tactics usually available in
+  interactive proof-assistants. When the broker is submitted a ``proof status''
+  (an incomplete proof tree and a focus on an open goal) it dispatches the proof
+  to the Web-Services, collects the successful results, and sends them back to
+  the client as ``hints'' as soon as they are available.
 
   In our experience this architecture turns out to be helpful both for
   experienced users (who can benefit from distributing heavy computations)
@@ -59,1007 +65,808 @@
 \end{abstract}
 
 \section{Introduction}
-  The \ws\ approach at software development seems to be a working solution for
+  The \ws{} approach to software development seems to be a working solution for
   getting rid of a wide range of incompatibilities between communicating
   software applications. W3C's efforts in standardizing related technologies
-  grant longevity and implementations availability for frameworks based on \wss\
-  for information exchange. As a direct conseguence, the number of such
+  guarantee longevity and availability of implementations for frameworks based
+  on \wss{} for information exchange.
As a direct consequence, the number of such
  frameworks is increasing and the World Wide Web is moving from a disorganized
  repository of human-understandable HTML documents to a disorganized repository
  of applications working on machine-understandable XML documents both for input
  and output.
-\oldpart
-{The first part is effective. The second was too off-topic and detailed for
- an abstract}
-{
-  This lack of organization along with the huge amount of documents available,
-  have posed not a few problems for data location. If this limitation could be
-  overcome using heuristic-based search engine for HTML pages, a satisfying
-  solution has yet to be found for \ws\ location \ednote{I don't like
-  `location', but `localization' is even worse; `retrieval'?}. A promising
-  architecture seems
-  to be the use of \emph{brokers}, which access point (usually an URI) is known
-  by the potential clients or by \emph{registries}, to which the \wss\ subscribe
-  to publish their functionalities.
-}
-\begin{newpart}{This is my counter-proposal. In my original idea it should be a
-more fluid and lighter version. Dunno.}
-The big challenge for the next future is to provide stable and reliable
-services over this disorganized, unreliable and ever-evolving architecture.
-The standard solution is providing a further level of stable services
-(called \emph{brokers}) that behave as common gateway/address for client
-applications to access a wide variety of services and abstract over them.
-
-Since the \emph{Declaration of Linz}, the MONET Consortium \cite{MONET},
-following the guidelines of the \wss/brokers approach, is
-\end{newpart}
- working on the development of a framework aimed at providing a set of software
- tools for the advertisement and discovering of mathematical web services.
-%CSC This framework turns out to be strongly based on both \wss\ and brokers.
-\oldpart
-{Hmm. The point is clear, but `CAS and Theorem Prover related web-services' is
- not very precise.}
-{
-  Several examples of CAS (Computer Algebra System) and Theorem Prover related
-  \wss\ have been shown to be useful and to fit well in the MONET Architecture
-  \ednote{cite some of them: CSC???}. On the other hand \ednote{too informal?}
-  the set of \wss\ related to Proof Assistants tends to be ... rather ...
-  empty!\ednote{I can already imagine your comment: BUHM! :-)}
-}
-\begin{newpart}{rephrasing}
-  Several groups have already developed \wss\ providing both computational
-  and reasoning capabilities \cite{???,???,???}\ednote{find some nice pointers
-  from the Calculemus conferences}: the first ones are implemented
-  on top of Computer Algebra Systems; the last ones provide interfaces to
-  well-known theorem provers. Proof-planners, proof-assistants, CAS and
-  domain-specific problem solvers are natural candidates to be client
-  of these services. Nevertheless, so far the number of examples in the
-  literature has been extremely low and the concrete benefits are still
-  to be assessed.\ednote{At this point we need to get to the point before the
-  reader moves on to coffee}
-\end{newpart}
-
-  In this paper we present an architecture, namely \hbugs\, implementing a
+  The big challenge for the near future is to provide stable and reliable
+  services over this disorganized, unreliable, and ever-evolving architecture.
+  The standard solution is to provide a further level of stable services (called
+  \emph{brokers}) that behave as common gateways/addresses for client
+  applications to access a wide variety of services and abstract over them.
+
+  Since the \emph{Declaration of Linz}, the MONET
+  Consortium\footnote{\url{http://monet.nag.co.uk/cocoon/monet/index.html}}
+  is working on the development of a framework, based on the
+  \wss{}/brokers approach, aimed at providing a set of software tools for the
+  advertisement and the discovery of mathematical \wss{}.
+  %CSC This framework turns out to be strongly based on both \wss{} and brokers.
+
+  Several groups have already developed software buses and
+  services\footnote{Most of these systems predate the development of
+  \wss. Those systems whose development is still active are slowly being
+  reimplemented as \wss.} providing both computational and reasoning
+  capabilities \cite{ws1,ws2,ws3,ws4}: the first ones are implemented on top of
+  Computer Algebra Systems; the second ones provide interfaces to well-known
+  theorem provers.
+  Proof-planners, proof-assistants, CASs and
+  domain-specific problem solvers are natural candidates to be clients of these
+  services. Nevertheless, so far the number of examples in the literature has
+  been insufficient to fully assess the concrete benefits of the framework.
+
+  In this paper we present an architecture, namely \hbugs{}, implementing a
   \emph{suggestion engine} for the proof assistant developed on behalf of the
-  \helm\ project. We provide several \wss (called \emph{tutors}) able to
+  \helm{}\footnote{Hypertextual Electronic Library of Mathematics,
+  \url{http://helm.cs.unibo.it}} project
+  \cite{helm}. We provide several \wss{} (called \emph{tutors}) able to
   suggest possible ways to proceed in a proof. The tutors are orchestrated
-  by a broker (a \ws\ itself) that is able to distribute a proof
+  by a broker (a \ws{} itself) that is able to dispatch a proof
   status from a client (the proof-assistant) to the tutors;
-  each tutor try to make progress in the proof and, in case
-  of success, notify the client that shows and \emph{hint} to the user.
-  Both the broker and the tutors are instances of the homonymous entities of
-  the MONET framework.
-
-  A precursor of \hbugs is the $\Omega$-Ants project \cite{???},
-  which provided similar functionalities to the
-  Omega proof-planner \cite{Omega}. The main architectural difference
-  between \hbugs and $\Omega$-Ants is that the latter is based on a
-  black-board architecture and it is not implemented using \wss and
-  brokers. Other differences will be detailed in Sect. \ref{conclusions}.
-
-  In Sect. \ref{architecture} we present the architecture of \hbugs.
+  each tutor tries to make progress in the proof and, in case
+  of success, notifies the client, which shows a \emph{hint} to the user.
+  The broker is an instance of the homonymous entity of the MONET framework.
+  The tutors are MONET services. Another \ws{} (which is not described in this
+  paper and which is called Getter \cite{zack}) is used to locate and download
+  mathematical entities; the Getter plays the role of the Mathematical Object
+  Manager of the MONET framework.
+
+  A precursor of \hbugs{} is the \OmegaAnts{} project
+  \cite{omegaants1,omegaants2}, which provided similar functionalities to the
+  \Omegapp{} proof-planner \cite{omega}.
The main architectural difference
+  between \hbugs{} and \OmegaAnts{} is that the latter is based on a
+  blackboard architecture and is not implemented using \wss{} and
+  brokers.
+
+  In Sect. \ref{architecture} we present the architecture of \hbugs{}.
+  A usage session is shown in Sect. \ref{usage}.
   Further implementation details are given in Sect. \ref{implementation}.
   Sect. \ref{tutors} is an overview of the tutors that have been implemented.
-  Finally, the last section is left for the conclusions and future works.
+  As usual, the final section of this paper is devoted to conclusions and
+  future work.
 
-\oldpart
-{I don't know if/where to put these parts}
-{
-  Despite of that the proof assistant case seems to be well suited to
-  investigate the usage of many different mathematical \wss. Indeed: most proof
-  assistants are still based on non-client/server architectures, are
-  application-centric instead of document-centric, offer a scarce level of
-  automation leaving entirely to the user the choice of which macro (usually
-  called \emph{tactic}) to use in order to make progress in a proof.
-
-  The average proof assistant can be, for example, a client of a \ws\
-  interfacing a specific or generic purpose theorem prover, or a client of a
-  \ws\ interfacing a CAS to simplify expressions in a particular mathematical
-  domain.
-}
 
+\section{An \hbugs{} Bird's Eye View}
+\label{architecture}
+  \myincludegraphics{arch}{t}{8cm}{\hbugs{} architecture}{\hbugs{} architecture}
+
+  The \hbugs{} architecture (depicted in Fig. \ref{arch}) is based on three
+  different kinds of actors: \emph{clients}, \emph{brokers}, and \emph{tutors}.
+  Each actor presents one or more \ws{} interfaces to its neighboring \hbugs{}
+  actors.
+
+  In this section we detail the role and requirements of each kind of
+  actor and we discuss the correspondences between them and the MONET
+  entities described in \cite{MONET-Overview}.
+  Due to lack of space, we cannot compare our framework to similar proposals,
+  such as the older and more advanced \Omegapp{} system. The study of the
+  correspondences with MONET is well motivated by the fact that the MONET
+  framework is still under development and that our implementation is one of the
+  first experiments in \ws{}-based distributed reasoning. On the other hand, a
+  comparison with \Omegapp{} would be less interesting since the functionalities we
+  provide so far are just a subset of the \OmegaAnts{} ones.
+
+  \paragraph{Clients}
+  An \hbugs{} client is a software component able to produce \emph{proof
+  statuses} and to consume \emph{hints}.
+
+  A proof status is a representation of an incomplete proof and is supposed to
+  be informative enough to be used by an interactive proof assistant. No
+  additional requirements exist on the proof status, but there should be an
+  agreement on its format between clients and tutors. A hint is an
+  encoding of a step that can be performed in order to proceed in an
+  incomplete proof. Usually it represents a reference to a tactic available
+  in some proof assistant along with an instantiation for its formal
+  parameters. Hints can also be more structured: a hint can be
+  as complex as a whole proof-plan.
+
+  Using W3C's terminology \cite{ws-glossary}, clients act both as \ws{}
+  providers and requesters, see Fig. \ref{interfaces}.
+  They act as providers receiving hints from the broker; they act as
+  requesters submitting new statuses to the tutors.
+
+  Clients additionally use broker services to know which tutors are available
+  and to subscribe to one or more of them.
+
+  Usually, when the client role is taken by an interactive proof assistant,
+  new statuses are sent to the broker as soon as the proof changes (e.g. when
+  the user applies a tactic or when a new proof is started); hints are shown
+  to the user by means of some effects in the user interface (e.g. popping up
+  a dialog box or highlighting a tactic button).
+
+  \hbugs{} clients act as MONET clients and ask brokers to provide access to a
+  set of services (the tutors). \hbugs{} has no actors corresponding to
+  MONET's Broker Locating Service (since the client is supposed to know the
+  URI of at least one broker). The \hbugs{} clients and tutors contact the
+  Getter (a MONET Mathematical Object Manager) to locate and retrieve
+  mathematical items from the \helm{} library.
+  The proof statuses that are exchanged
+  by the \hbugs{} actors, instead, are built on the fly and are neither
+  stored nor given a unique identifier (URI) to be managed by the
+  Getter.
+
+  \paragraph{Brokers}
+  \myincludegraphics{interfaces}{t!}{10cm}{\hbugs{} \wss{} interfaces}
+  {\hbugs{} \wss{} interfaces}
+
+  Brokers are the key actors of the \hbugs{} architecture since they
+  act as intermediaries between clients and tutors. They behave as \wss{}
+  providers and requesters for \emph{both} clients and tutors, see Fig.
+  \ref{interfaces}.
+
+  With respect to the client, a broker acts as a \ws{} provider, receiving the
+  proof status and forwarding it to one or more tutors.
+  It also acts as a \ws{} requester sending
+  hints to the client as soon as they are available from the tutors.
+
+  With respect to the tutors, the \ws{} provider role is accomplished by
+  receiving hints as soon as they are produced; as a requester, it is
+  accomplished by asking for computations (\emph{musings} in \hbugs{}
+  terminology) on statuses received from clients and by stopping out-of-date
+  but still ongoing \musings{}.
+
+  Additionally brokers keep track of available tutors and of clients'
+  subscriptions.
+
+  \hbugs{} brokers act as MONET brokers implementing the following components:
+  Client Manager, Service Registry Manager (keeping track of available
+  tutors), Planning Manager (choosing the available tutors among the ones to
+  which the client is subscribed), Execution Manager. The Service Manager
+  component is not required since the session handler, which identifies
+  a session between a service and a broker, is provided to the service by
+  the broker instead of being received from the service when the session is
+  initialized. In particular, a session is identified by a unique identifier
+  for the client (its URL) and a unique identifier for the broker (its
+  URL).
+
+  Notice that \hbugs{} brokers have no knowledge of the domain area of
+  proof-assistants, nor are they able to interpret the messages that they
+  are forwarding. They are indeed only in charge of maintaining the
+  abstraction of several reasoning blackboards --- one for each client ---
+  of capacity one: a blackboard is created when the client submits a problem;
+  it is then ``shared'' by the client and all the tutors until the client
+  submits the next problem. For instance, replacing the client with a CAS and
+  all the tutors with agents implementing different resolution methods for
+  differential equations would not require any change in the broker. Notice
+  that all the tutors must expose the same interface to the broker.
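+
+  To make the preceding description concrete, the following OCaml signatures
+  sketch the interfaces exchanged among the three kinds of actors. This is
+  only a minimal sketch under our own naming: methods such as
+  \texttt{start\_musing} or \texttt{musing\_completed} are illustrative and do
+  not necessarily match the actual \hbugs{} implementation.
+
+\begin{verbatim}
+(* Illustrative sketch of the HBugs actor interfaces; all names  *)
+(* are hypothetical. Every request is asynchronous: answers come *)
+(* back later as calls to the issuer's own interface.            *)
+
+type proof_status = string  (* XML serialization of a status *)
+type hint         = string  (* XML serialization of a hint   *)
+type musing_id    = string
+
+(* Interface exposed by a broker to its clients. *)
+module type BROKER_FOR_CLIENTS = sig
+  val register_client : client_url:string -> unit
+  val list_tutors     : unit -> string list
+  val subscribe       : client_url:string -> tutors:string list -> unit
+  val submit_status   : client_url:string -> proof_status -> unit
+end
+
+(* Interface exposed by every tutor to the broker: the broker *)
+(* starts musings on new statuses and stops out-of-date ones. *)
+module type TUTOR = sig
+  val start_musing : proof_status -> musing_id
+  val stop_musing  : musing_id -> unit
+end
+
+(* Callback interfaces delivering the asynchronous answers. *)
+module type CLIENT = sig
+  val hint : hint -> unit   (* broker -> client *)
+end
+
+module type BROKER_FOR_TUTORS = sig
+  (* [None] signals a failed musing, [Some h] a successful one. *)
+  val musing_completed : musing_id -> hint option -> unit
+end
+\end{verbatim}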
+
+  The MONET architecture specification does not state explicitly whether the
+  service and broker answers can be asynchronous. Nevertheless, the
+  described information flow implicitly suggests a synchronous implementation.
+  By contrast, in \hbugs{} every request is asynchronous: the connection
+  used by an actor to issue a query is immediately closed; when a service
+  produces an answer, it gives it back to the issuer by calling the
+  appropriate actor's method.
+
+  \paragraph{Tutors}
+  Tutors are software components able to consume proof statuses, producing
+  hints.
+  \hbugs{} does not specify by which means hints should be produced: tutors
+  can use any means necessary (heuristics, external theorem provers or CASs,
+  etc.). The only requirement is that there exists an agreement on the formats
+  of proof statuses and hints.
+
+  Tutors act both as \ws{} providers and requesters for the broker, see Fig.
+  \ref{interfaces}. As
+  providers, they wait for commands requesting to start a new \musing{} on
+  a given proof status or to stop an old, out-of-date \musing{}. As
+  requesters, they signal to the broker the end of a \musing{} along with its
+  outcome (a hint in case of success or a failure notification).
+
+  \hbugs{} tutors act as MONET services.
+
+\section{An \hbugs{} Session Example}
+\label{usage}
+In this section we describe a typical \hbugs{} session. The aim of the
+session is to solve the following easy exercise:
+\begin{exercise}
+Let $x$ be a generic real number. Using the \helm{} proof-engine,
+prove that
+\begin{displaymath}
+x = \frac{(x+1)*(x+1) - 1 - x*x}{2}
+\end{displaymath}
+\end{exercise}
+
+Let us suppose that the \hbugs{} broker is already running and that the
+tutors have already registered themselves with the broker.
+When the user starts our proof-engine \texttt{gTopLevel}, the system registers
+itself with the broker, which sends back the list of available tutors. By
+default, \texttt{gTopLevel} notifies the broker of its intention to subscribe
+to every available tutor. The user can always open a configuration window where
+she is presented with the list of available tutors and can independently
+subscribe to and unsubscribe from each tutor.
+
+\myincludegraphics{step1}{t}{12cm}{Example session.}
+  {Example session.}
+%\myincludegraphics{step2}{t}{4cm}{Example session, snapshot 2.}
+%  {Example session, snapshot 2.}
+
+The user can now insert into the system the statement of the theorem and start
+proving it. Let us suppose that the first step of the user is proving
+that the denominator 2 is different from 0. Once this technical result
+is proven, the user must prove the goal shown in the upper right corner
+of the window in the background in Fig. \ref{step1}.
+
+While the user is wondering how to proceed in the proof, the tutors are
+trying to progress in the proof. After a while, the tutors' suggestions
+start to appear in the lower part of the \hbugs{} interface window
+(the topmost window in Fig. \ref{step1}). In this case, the tutors are able
+to produce 23 hints. The first and not very useful hint suggests proceeding in
+the proof by exchanging the two sides of the equality.
+The second hint suggests reducing both sides of the equality to their normal
+form using only reductions which are justified by the ring structure of the
+real numbers; the two normal forms, though, are so different that the proof is
+not really simplified.
+All the remaining 21 hints suggest applying a lemma from the distributed
+library of \helm{}.
The user can look at the statement of any lemma by clicking
+on its URI.
+
+The user can now look at the list of suggestions and realize that a good one is
+applying the lemma \texttt{r\_Rmult\_mult}, which allows multiplying both
+members of the equality by the same scalar\footnote{Even if she does not
+receive the hint, the user probably already knows that this is the right way
+to proceed. The difficult part, accomplished by the hint, is guessing the name
+of the lemma to apply.}.
+Double-clicking on the hint automatically applies
+the lemma, reducing the proof to closing three new goals. The first one asks
+the user for the scalar to use as an argument of the previous lemma; the second
+one states that the scalar is different from 0; the third goal (the main
+one) asks to prove the equality between the two new members.
+% is shown in Fig. \ref{step2} where $?_3[H;x]$ stands for
+% the still unknown scalar argument, which can have only $H$ and $x$ as
+% free variables.
+
+The user proceeds by instantiating the scalar with the number 2. The
+\texttt{Assumption} tutor now suggests closing the second goal (which
+states that $2 \neq 0$) by applying the hypothesis $H$.
+No useful suggestions, instead, are generated for the main goal
+$2*x = 2*((x+1)*(x+1)-1-x*x)*2^{-1}$.
+To proceed in the proof the user needs to simplify the
+expression using the lemma \texttt{Rinv\_r\_simpl\_m}, which states that
+$\forall x,y.\;y = x * y * x^{-1}$. Since we do not yet provide any tutor
+suggesting simplifications, the user must find this simplification by
+herself. Once she finds it, the goal is reduced to proving that
+$2*x = (x+1)*(x+1) - 1 - x*x$. This equality is easily solved by the
+\texttt{Ring} tutor, which suggests\footnote{The \texttt{Ring} suggestion is
+just one of the 22 hints that the user receives. It is the only hint that
+does not open new goals, but right now the user has no way to know
+that.} to the user how to complete the proof in one macrostep.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%   Commands to feed to gTopLevel   %
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% New proof:
+% !x.(not (eqT ? (Rplus R1 R1) R0)) -> (eqT ? x (Rdiv (Rminus (Rminus (Rmult (Rplus x R1) (Rplus x R1)) R1) (Rmult x x)) (Rplus R1 R1)))
+% Intros x H
+% Apply r_Rmult_mult
+% 3: Apply H
+% Simpl (to unfold Rdiv)
+% Rewrite <-
+%  (Rmult_assoc (Rplus R1 R1) (Rplus (Rplus (Rmult (Rplus x R1) (Rplus x R1)) (Ropp R1)) (Ropp (Rmult x x))) (Rinv (Rplus R1 R1)))
+% Rewrite ->
+%  (Rinv_r_simpl_m (Rplus R1 R1) (Rplus (Rplus (Rmult (Rplus x R1) (Rplus x R1)) (Ropp R1)) (Ropp (Rmult x x))) H)
+% *** Ring
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+\section{Implementation Highlights}
+\label{implementation}
+In this section we present some of the most relevant implementation details of
+the \hbugs{} architecture.
+
+
+  \paragraph{Proof status}
+  In our implementation of the \hbugs{} architecture we used the proof
+  assistant of the \helm{} project (codename \texttt{gTopLevel}) as an \hbugs{}
+  client. Thus we have implemented serialization/deserialization capabilities
+  for its internal status. In order to be able to describe \wss{} that
+  exchange statuses in WSDL using the XML Schema type system, we have chosen an
+  XML format as the target format for the serialization.
+
+%  A schematic representation of the gTopLevel internal status is depicted in
+%  Fig. \ref{status}.
+  Each proof is represented by a tuple of four elements:
+  \emph{uri}, \emph{metasenv}, \emph{proof}, \emph{thesis}.
+
+% \myincludegraphics{status}{t}{8cm}{gTopLevel proof status}{gTopLevel proof
+% status}
+
+  \begin{description}
+   \item[uri]: a URI chosen by the user at the beginning of the proof
+    process. Once (and if) proved, that URI will globally identify the term
+    inside the \helm{} library (given that the user decides to save it).
+   \item[thesis]: the ongoing proof thesis.
+   \item[proof]: the current incomplete proof tree. It can contain
+    \emph{metavariables} (holes) that stand for the parts of the proof
+    that are still to be completed. Each metavariable appearing in the
+    tree references one element of the metavariables environment
+    (\emph{metasenv}).
+   \item[metasenv]: the metavariables environment is a list of
+    \emph{goals} (unproved conjectures).
+    In order to complete the proof, the user has to instantiate every
+    metavariable in the proof with a proof of the corresponding goal.
+    Each goal is identified by a unique identifier and has a context
+    and a type (the goal thesis). The context is a list of named
+    hypotheses (declarations and definitions). Thus the context and the goal
+    thesis form a sequent, which is the statement of the proof that will
+    be used to instantiate the metavariable occurrences.
+  \end{description}
+
+  Each of these pieces of information is represented in XML as described in
+  \cite{mowglicic}. Additionally, an \hbugs{} status carries the unique
+  identifier of the current goal, which is the goal the user is currently
+  focused on. Using this value it is possible to implement different
+  client-side strategies: the user could ask the tutors to work on the goal
+  she is considering or to work on the other ``background'' goals.
+
+  \paragraph{Hints}
+  A hint in the \hbugs{} architecture should carry enough information to
+  permit the client to progress in the current proof. In our
+  implementation each hint corresponds to either one of the tactics available
+  to the user in \texttt{gTopLevel} (together with its actual arguments) or a
+  set of alternative suggestions (a list of hints).
+
+  For tactics that do not require any particular argument (like tactics that
+  apply type constructors or decision procedures)
+  only the tactic name is represented in the hint. For tactics that need
+  terms as arguments (for example the \texttt{Apply} tactic that applies a
+  given lemma) the hint includes a textual representation of them, using the
+  same representation used by the interactive proof assistant when querying
+  the user for terms. In order to be transmitted between \wss{}, hints are
+  serialized in XML.
+  It is also possible for a tutor to return several hints at once,
+  grouping them in a particular XML element.
+  This feature turns out to be particularly useful for the
+  \emph{searchPatternApply} tutor (see Sect. \ref{tutors}) that
+  queries a lemma database and returns to the client a list of all lemmas that
+  could be used to complete the proof. This particular hint is encoded as a
+  list of \texttt{Apply} hints, each of them having one of the results as term
+  argument.
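+
+  As a concrete illustration, the following OCaml sketch shows one possible
+  rendering of the proof status tuple and of the hint encoding described
+  above. It is only a sketch: the type and constructor names are ours and do
+  not necessarily coincide with the ones actually used by \texttt{gTopLevel}
+  or by its XML serialization.
+
+\begin{verbatim}
+(* Hypothetical rendering of the HBugs proof status and hints; *)
+(* all names are illustrative, not the actual gTopLevel ones.  *)
+type goal = {
+  goal_id : int;                     (* unique goal identifier *)
+  context : (string * string) list;  (* named hypotheses       *)
+  ty      : string;                  (* the goal thesis        *)
+}
+
+type proof_status = {
+  uri      : string;     (* URI that will identify the proved term *)
+  metasenv : goal list;  (* open conjectures                       *)
+  proof    : string;     (* incomplete proof tree, with holes      *)
+  thesis   : string;     (* overall thesis                         *)
+  current  : int;        (* goal the user is currently focused on  *)
+}
+
+type hint =
+  | Tactic of string            (* argument-free tactic, e.g. "Ring"  *)
+  | Apply of string             (* Apply with a textual term argument *)
+  | Alternatives of hint list   (* several hints grouped together     *)
+
+(* For instance, the searchPatternApply tutor could answer: *)
+let example : hint =
+  Alternatives [Apply "r_Rmult_mult"; Apply "Rinv_r_simpl_m"]
+\end{verbatim}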
|E^{II}_2| + |T| + |R| + 2~~~ & -% \mbox{\bf Total number} \\ -% ~ & -% \end{array} -% \end{array} -% \end{displaymath} -% -% While the overall space complexity of the terms generated by the new -% implementation is asymptotically equal to the one of the old implementation, -% all the constants involved are much smaller, but for the one of -% $E'_2$ (the two normal forms) that before was 0 and now is -% equal to 2. Is it possible to have goals for which the new implementation -% behaves worst than the old one? Unfortunately, yes. This happens when -% the size of the two normal forms $E'_1$ and $E'_2$ is greatly huger than -% ($E^{II}_1 + |T| + 1)$ and $(E^{II}_2 + |T| + 1)$. This happens when -% the number of occurrences of non-primitive terms is much higher than -% the number of non-primitive terms and the size of them is big. More -% formally, being $m$ the number of non-primitive terms, $d$ the average -% size and $n$ the number of occurrences, the new implementation creates bigger -% terms than the previous one if -% \begin{displaymath} -% n \log_2 m + m d < n d -% \end{displaymath} -% where the difference between the two members is great enough to hide the gain -% achieved lowering all the other constants. -% The logarithmic factor in the previous -% formula derives from the implementation of the map of variables to -% non-primitive terms as a tree and the representation of occurrences with -% the path inside the tree to retrieve the term. -% -% To fix the problem, for each non-primitive term occurring more than once -% inside the normal forms, we can use a \emph{let \ldots in} local definition -% to bind it to a fresh identifier; then we replace every occurrence of -% the term inside the normal forms with the appropriate -% identifier\footnote{This has not yet been implemented in Coq.}. -% In this way, the above inequation becomes -% \begin{displaymath} -% n \log_2 m + m d < n + m d -% \end{displaymath} -% that is never satisfied. -% -% Here it is important to stress how the latest problem was easily -% overlooked during the implementation and has been discovered -% only during the previous analysis, strengthening our belief in -% the importance of this kind of analysis for tactic implementations. -% -% In the next two paragraphs we will consider only the new implementation -% with the above fixing. -% -% \subsubsection{Retyping Time.} -% \paragraph{Terms metric definition:} given a term $t$, the metric $|.|$ associates to it the time $|t|$ required to retype it. -% -% Due to lack of space, we will omit a detailed analysis as the one given -% for terms size. Nevertheless, we can observe that the retyping time required -% is surely smaller because all the type-checking operations required for -% the new implementation are already present in the old one, but for the -% type-checking of the two normal forms, that have fewer complexity -% than the type-checking of the two abstract normal forms, and the -% \emph{let \ldots in} definitions that have the same complexity of the -% type-checking of the variable map. Moreover, the quite expensive operation -% of computing the two normal forms is already done during proof construction. 
-% -% %In the case in which both the members of the equality are not in normal form, -% %we can't expect a great improvement in retyping time but for very cheap -% %normalizations; this because retyping time is dominated by the factor due to -% %$apolynomial\_normalize$ that is present in both implementations and is -% %usually much higher than the factor due to the removed applications of -% %$interp\_ap$, $interp\_sacs$ and the eliminations of equality. -% -% %The situation is very different if one of the two members is already -% %convertible to its normal form and the reduction involved in the normalization -% %is expensive. In this rather unlikely case, the new implementation -% %avoids the reduction at all, roughly halving the overall retyping time. -% -% In section \ref{benchmarks} we present some benchmarks to give an idea of -% the real gain obtained. -% -% \subsubsection{Naturality.}~\\~\\ -% The idea behind the \emph{Ring} tactic is to be able to prove -% an equality showing that both members have the same normal form. -% This simply amounts to show that each member is equal to the same -% normal form, that is exactly what is done in the new implementation. -% Indeed, every step that belonged to the old implementation and has been -% changed or removed to fix overkilling used to lead to some unnatural step: -% \begin{enumerate} -% \item The fact that the normalization is not done on the concrete -% representation, but passing through two abstract ones that are -% interpreted on the concrete terms is an implementative detail that -% was not hidden as much as possible as it should be. -% \item Normalizing a member of the equality that is already in normal form, -% is illogical and so unnatural. Hence it should be avoided, but it was not. -% \item The natural way to show $A=B$ under the hypothesis $B=A$ is just -% to use the symmetric property of equality. Instead, the old implementation -% rewrote $B$ with $A$ using the hypothesis and proved the goal by reflexivity. -% \item Using local definitions (\emph{let \ldots in}) as abbreviations -% rises the readability of the proof by shrinking its size removing -% subexpressions that are not involved in the computation. -% \end{enumerate} -% -% \subsection{Some Benchmarks} -% \label{benchmarks}To understand the actual gain in term size and retyping -% time on real-life examples, we have done some benchmarks on the whole set -% of theorems in the standard library of Coq that use the Ring tactic. The -% results are shown in table~\ref{benchs}. -% -% Term size is the size of the disk dump of the terms. Re-typing time is the -% user time spent by Coq in proof-checking already parsed terms. The reduction -% of the terms size implies also a reduction in Coq parsing time, that is -% difficult to compute because Coq files do not hold single terms, but whole -% theories. Hence, the parsing time shown is really the user time spent by Coq -% to parse not only the terms on which we are interested, but also all the -% terms in their theories and the theories on which they depend. So, this last -% measure greatly under-estimates the actual gain. -% -% Every benchmark has been repeated 100 times under different load conditions on -% a 600Mhz Pentium III bi-processor equipped with 256Mb RAM. The timings shown -% are mean values. 
-%
-% \begin{table}
-% \begin{center}
-% \begin{tabular}{|l|c|c|c|}
-% \hline
-%  & ~Term size~ & Re-typing time & Parsing time\\
-% \hline
-% Old implementation & 20.27Mb & 4.59s & 2.425s \\
-% \hline
-% New implementation & 12.99Mb & 2.94s & 2.210s \\
-% \hline
-% Percentage reduction & 35.74\% & 35.95\% & 8.87\% \\
-% \hline
-% \end{tabular}
-% \end{center}
-% \caption{Some benchmarks}
-% \label{benchs}
-% \end{table}
-%
-% \section{Conclusions and Future Work}
-% Naive ways of implementing tactics lead to low quality terms that are
-% difficult to inspect and process. To improve the situation, we show
-% how metrics defined for terms naturally induce metrics for tactics and
-% tactics implementation and we advocate the usage of such metrics for
-% tactics evaluation and implementation. In particular, metrics could
-% be used to analyze the quality of an implementation or could be used at run
-% time by a tactic to choose what is the best way to proceed.
-%
-% To safely replace a tactic implementation with another one, it is important
-% to define when two tactics are equivalent, i.e. generate equivalent terms.
-% In this work, the equivalence relation chosen for terms has simply been
-% $\beta\delta\iota$-convertibility, that in many situations seems too strong.
-% Hence, an important future work is the study of weaker forms of term
-% equivalence and the equivalence relations they induce on tactics. In particular,
-% it seems that proof-irrelevance, $\eta$-conversion and commuting conversions
-% must all be considered in the definition of a suitable equivalence relation.
+ We would like to stress that the \hbugs{} architecture has no dependency
+ on either the hint or the status representation: the only message parts
+ that are fixed are those representing the administrative messages
+ (the envelopes in the \wss{} terminology). In particular, the broker can
+ manage at the same time several sessions working on different status/hint
+ formats. Of course, there must be an agreement between the clients
+ and the tutors on the format of the data exchanged.
+
+ In our implementation the client does not trust the tutors' hints: since
+ hints are encoded as references to available tactics, an \hbugs{} client,
+ upon receipt of a hint, simply tries to replay the work
+ done by a tutor on its local copy of the proof. The application of the
+ hint may even fail to type-check, in which case the client's copy of the
+ proof is left undamaged once the error is spotted. Note, however, that it
+ is still possible to implement a complex tutor that looks for a proof by
+ backtracking and that
+ sends back to the client a hint whose argument is a witness (a trace) of
+ the proof found: the client applies the hint, reconstructing (and checking
+ the correctness of) the proof from the witness, without having to
+ re-discover the proof itself.
+
+ An alternative implementation where the tutors are trusted would simply
+ send back to the client a new proof status. Upon receiving the
+ proof status, the client would just override its current proof status with
+ the suggested one. In the case of those clients which are implemented
+ using proof-objects (as the Coq proof-assistant, for instance), it is
+ still possible for the client to type-check the proof-object and reject
+ wrong hints. The systems that are not based on proof-objects
+ (as PVS, NuPRL, etc.), instead, must completely trust the new proof
+ status; in this case the \hbugs{} architecture would at least need to be
+ extended with client-tutor authentication.
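+ To make the untrusted-hint discipline concrete, the following OCaml
+ sketch shows how a client could replay a hint on a copy of its status.
+ The \texttt{hint} datatype and all function names are illustrative
+ assumptions of ours, not the actual \hbugs{} interfaces:
+ \begin{verbatim}
+ (* Illustrative datatypes, not the actual H-Bugs message format. *)
+ type hint =
+   | Tactic of string * string list (* tactic name, guessed arguments *)
+   | Trace of hint list             (* proof witness found by a tutor *)
+
+ (* [replay apply status hint] re-runs the tutor's work locally;
+    [apply] is the client's own tactic engine. *)
+ let rec replay apply status = function
+   | Tactic (name, args) -> apply status name args
+   | Trace steps -> List.fold_left (replay apply) status steps
+
+ (* Untrusted hints: any failure (e.g. a type-checking error)
+    leaves the client's current status untouched. *)
+ let apply_hint apply status hint =
+   try Some (replay apply status hint) with _ -> None
+ \end{verbatim}
+ When \texttt{apply\_hint} returns \texttt{None} the client keeps its
+ current status, so a wrong hint can never damage the ongoing proof.
+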
+ \paragraph{Registries}
+ Being central in the \hbugs{} architecture, the broker is also responsible
+ for housekeeping operations for both clients and tutors. These operations
+ are implemented using three different data structures called
+ \emph{registries}: the clients registry, the tutors registry and the
+ \musings{} registry.
+
+ In order to use the suggestion engine, a client should register itself with
+ the broker and subscribe to one or more tutors. The registration phase is
+ triggered by the client using the \texttt{Register\_client} method of the
+ broker, sending it a unique identifier and its base URI as a \ws{}. After
+ the registration, the client can use the \texttt{List\_tutors} method of the
+ broker to get a list of available tutors. Finally, the client can
+ subscribe to one or more of these using the \texttt{Subscribe} method of the
+ broker. Clients can also unregister from brokers using the
+ \texttt{Unregister\_client} method.
+
+ The broker keeps track of both registered clients and clients' subscriptions
+ in the clients registry.
+
+ In order to be advertised to clients during the subscription phase, tutors
+ should register with the broker using the \texttt{Register\_tutor} method of
+ the broker. This method is very similar to \texttt{Register\_client}:
+ tutors are required to send a unique identifier and a base URI for their
+ \ws{}.
+ Additionally, tutors are required to send a human-readable description of
+ their capabilities; this information could be used by the client's user to
+ decide which tutors she wants to subscribe to. Like clients, tutors can
+ unregister from brokers using the \texttt{Unregister\_broker} method.
+
+ Each time the client status changes, it is sent to the
+ broker using its \texttt{Status} method. Using both the clients registry (to
+ look up the client's subscriptions) and the tutors registry (to check
+ whether some tutors have unregistered), the broker is able to decide to
+ which tutors the new status has to be forwarded.
% \ednote{CSC: qui o nei lavori futuri parlare
% della possibilit\'a di avere un vero brocker che multiplexi le richieste
% dei client localizzando i servizi, etc.}
+
+ The forwarding operation is performed using the \texttt{Start\_musing}
+ method of the tutors, which is a request to start a new computation
+ (a \emph{\musing{}}) on a given status. The return value of
+ \texttt{Start\_musing} is a
+ \musing{} identifier that is saved in the \musings{} registry along with
+ the identifier of the client that triggered the \musing{}.
+
+ As soon as a tutor completes a \musing{}, it informs the broker
+ using its \texttt{Musing\_completed} method; the broker can now remove the
+ \musing{} entry from the \musings{} registry and, depending on its outcome,
+ inform the client. In case of success, one of the \texttt{Musing\_completed}
+ arguments is a hint to be sent to the client; otherwise there is no need to
+ inform the client and the \texttt{Musing\_completed} method is called
+ just to update the \musings{} registry.
+
+ Consulting the \musings{} registry, the broker is able to know, at any
+ time, which \musings{} are being executed on which tutor. This peculiarity
+ is exploited by the broker on invocation of the \texttt{Status} method,
+ as sketched below.
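+ The following OCaml sketch summarizes the three registries and the
+ broker's reaction to a \texttt{Status} invocation; names and types are
+ illustrative assumptions of ours, not the actual broker code (which, for
+ example, generates identifiers differently and handles concurrency):
+ \begin{verbatim}
+ type registries = {
+   mutable clients : (string * (string * string list)) list;
+     (* client id -> base URI, subscribed tutor ids *)
+   mutable tutors  : (string * (string * string)) list;
+     (* tutor id -> base URI, capability description *)
+   mutable musings : (string * (string * string)) list;
+     (* musing id -> triggering client id, executing tutor id *)
+ }
+
+ (* Reaction to Status: stop the musings that refer to the now
+    obsolete status (see below), then forward the new status to
+    every subscribed tutor, recording the returned musing ids. *)
+ let on_status reg stop_musing start_musing client status =
+   List.iter
+     (fun (mid, (cid, tid)) -> if cid = client then stop_musing tid mid)
+     reg.musings;
+   reg.musings <-
+     List.filter (fun (_, (cid, _)) -> cid <> client) reg.musings;
+   try
+     let (_uri, subscribed) = List.assoc client reg.clients in
+     List.iter
+       (fun tid ->
+          let mid = start_musing tid status in
+          reg.musings <- (mid, (client, tid)) :: reg.musings)
+       subscribed
+   with Not_found -> ()
+ \end{verbatim}
+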
+ Receiving a new status from the client indeed implies that the previous
+ status no longer exists and that all \musings{} working on it should be
+ stopped: in addition to the already described behavior (i.e. starting new
+ \musings{} on the received status), the broker also takes care of stopping
+ the ongoing computations by invoking the \texttt{Stop\_musing} method of
+ the tutors.
+
%CASSATO
% \paragraph{\wss{}}
% As already discussed, all \hbugs{} actors act as \wss{} offering one or more
% services to neighbor actors. To grant as most accessibility as possible to
% our \wss{} we have chosen to bind them using the HTTP/POST\footnote{Given
% that our proof assistant was entirely developed in the Objective Caml
% language, we have chosen to develop also \hbugs{} in that language in order
% to maximize code reuse. To develop \wss{} in Objective Caml we have
% developed an auxiliary generic library (\emph{O'HTTP}) that can be used to
% write HTTP 1.1 Web servers and abstracts over GET/POST parsing. This library
% supports different kinds of Web servers architectures, including
% multi-process and multi-threaded ones.} bindings described in
% \cite{wsdlbindings}.
+
+ \paragraph{Tutors}
+ Each tutor exposes a \ws{} interface and should be able to work not only for
+ many different clients referring to a common broker, but also for many
+ different brokers. The potentially high number of concurrent clients imposes
+ a multi-threaded or multi-process architecture.
+
+ Our current implementation is based on a multi-threaded architecture
+ exploiting the capabilities of the O'HTTP library \cite{zack}. Each tutor is
+ composed of one always-running thread plus an additional thread for each
+ \musing{}.
+ One thread is devoted to listening for incoming \ws{} requests; when a
+ request is received, control is passed to a second thread, created on the
+ fly, that handles the incoming request (the usual one-thread-per-request
+ approach in web server design).
+ In particular, if the received request is \texttt{Start\_musing}, a new
+ thread is spawned to handle the \musing{}; the thread in charge of the HTTP
+ request returns an HTTP response containing the identifier of the
+ just-started \musing{}, and then dies. If the received request is
+ \texttt{Stop\_musing}, instead, the spawned thread kills the thread
+ responsible for the \musing{} whose identifier is the argument
+ of the \texttt{Stop\_musing} method.
+
+ This architecture turns out to be scalable and allows the running threads
+ to share the cache of loaded (and type-checked) theorems.
+ As we will explain in Sect. \ref{tutors}, this feature turns out to be
+ really useful for tactics that rely on a huge but fixed set of lemmas,
+ as every reflexive tactic does.
+
+ The implementation of a tutor within the described architecture is not that
+ difficult, given a language with good threading capabilities (as OCaml has)
+ and a pool of already implemented tactics (as \texttt{gTopLevel} has).
+ Working with threads is nevertheless known to be error-prone, due to the
+ intrinsic complexity of concurrent programming. Moreover, there is a
+ non-negligible part of code that needs to be duplicated in every tutor:
+ the code to register the tutor with the broker and to handle HTTP requests;
+ the code to manage the creation and termination of threads; and the code for
+ parsing the requests and serializing the answers.
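+ To give the flavor of this shared part, here is a minimal OCaml sketch of
+ a generic tutor skeleton. The module names, the signature of
+ \texttt{propose} and the per-\musing{} thread bookkeeping are illustrative
+ assumptions of ours; the real skeleton also deals with broker registration
+ and HTTP (de)serialization, which are omitted here:
+ \begin{verbatim}
+ module type TACTIC = sig
+   val name : string             (* quoted in the hint sent back *)
+   val propose : string -> string option
+                                 (* status -> args on success *)
+ end
+
+ module MakeTutor (T : TACTIC) = struct
+   (* ongoing musings: identifier -> worker thread *)
+   let musings : (string, Thread.t) Hashtbl.t = Hashtbl.create 17
+   let fresh = let n = ref 0 in fun () -> incr n; string_of_int !n
+
+   (* Spawn a worker that applies the tactic and reports the outcome
+      back to the broker via [notify]; the HTTP request thread only
+      returns the fresh musing identifier, then dies. *)
+   let start_musing notify status =
+     let id = fresh () in
+     let work () =
+       match T.propose status with
+       | Some args -> notify id (Some (T.name ^ " " ^ args))
+       | None -> notify id None (* no hint: just update registry *)
+     in
+     Hashtbl.replace musings id (Thread.create work ());
+     id
+
+   (* Kill the worker of a musing made obsolete by a new status;
+      we assume Thread.kill is available, as in the design above. *)
+   let stop_musing id =
+     try
+       Thread.kill (Hashtbl.find musings id);
+       Hashtbl.remove musings id
+     with Not_found -> ()
+ end
+ \end{verbatim}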
+
+ As a consequence, we have written a generic implementation of a tutor,
+ parameterized over the code that actually proposes the hint and over some
+ administrative data (such as the port the tutor will be listening to).
+
+ The generic tutor skeleton is really helpful in writing new tutors.
+ Nevertheless, the code obtained by converting existing tactics into tutors
+ is still quite repetitive: every tutor that wraps a tactic has to
+ instantiate its own copy of the proof-engine kernel and, for each request,
+ it has to override its status, guess the tactic arguments, apply the tactic
+ and, in case of success, send back a hint with the tactic name and the
+ chosen arguments. Of course, the complex part of the work is guessing the
+ right arguments. For the simple case of tactics that do not require
+ any argument, though, we are able to automatically generate the whole
+ tutor code given the tactic name. Concretely, we have written a
+ tactic-based tutor template and a script that parses an XML file with
+ the specification of the tutor and generates the tutor's code.
+ The XML file describes the tutor's port, the code to invoke the tactic,
+ the hint that is sent back upon successful application and a
+ human-readable explanation of the tactic implemented by the tutor.
+
+\section{The Implemented \hbugs{} Tutors}
+\label{tutors}
+To test the \hbugs{} architecture and to assess the utility of a suggestion
+engine for the end user, we have implemented several tutors. In particular,
+we have investigated three classes of tutors:
+\begin{enumerate}
+ \item \emph{Tutors for beginners}. These are tutors that implement tactics
+  which are neither computationally expensive nor difficult to understand:
+  an expert user can always understand whether the tactic can be applied
+  without having to try it. For example, the following implemented tutors
+  belong to this class:
+  \begin{itemize}
+   \item \emph{Assumption Tutor}: it ends the proof if the thesis is
+    equivalent\footnote{In our implementation, the equivalence relation
+    imposed by the logical framework is \emph{convertibility}. Two
+    expressions are convertible when they reduce to the same normal form.
+    Two ``equal'' terms depending on free variables can be non-convertible
+    since free variables stop the reduction. For example, $2x$ is convertible
+    with $(3-1)x$ because they both reduce to the same normal form
+    $x + x + 0$; but $2x$ is not convertible with $x \cdot 2$ since the
+    latter is already in normal form.}
+    to one of the hypotheses\footnote{
+    In some cases, especially when non-trivial computations are involved,
+    the user is totally unable to figure out the convertibility of two terms.
+    In these cases the tutor becomes handy also for expert users.}.
+   \item \emph{Contradiction Tutor}: it ends the proof by \emph{reductio ad
+    absurdum} if one hypothesis is equivalent to $False$.
+   \item \emph{Symmetry Tutor}: if the goal thesis is an equality, it
+    suggests applying the symmetry property of equality.
+   \item \emph{Left/Right/Exists/Split/Reflexivity/Constructor Tutors}:
+    the Constructor Tutor suggests proceeding in the proof by applying one
+    or more constructors when the goal thesis is an inductive type or a
+    proposition inductively defined according to the declarative
+    style\footnote{An example of a proposition that can be given in
+    declarative style is the $\le$ relation over natural numbers:
+    $\le$ is the smallest relation
+    such that $n \le n$ for every $n$ and $n \le m$ for every $n,m$ such
+    that $n \le p$ where $p$ is the predecessor of $m$. Thus, a proof
+    of $n \le n$ is simply the application of the first constructor to
+    $n$ and a proof of $n \le m$ is the application of the second
+    constructor to $n,m$ and a proof of $n \le p$.}.
+    Since disjunction, conjunction, existential quantification and
+    Leibniz equality are particular cases of inductive propositions,
+    all the other tutors of this class are instantiations of the
+    Constructor tactic. Left and Right suggest proving a disjunction
+    by proving its left/right member; Split reduces the proof of a
+    conjunction to the two proofs of its members; Exists suggests
+    proving an existential quantification by providing a
+    witness\footnote{This task is left to the user.}; Reflexivity proves
+    an equality whenever the two sides are convertible.
+  \end{itemize}
+  Beginners, when first faced with a tactic-based proof-assistant, get
+  lost quite soon since the set of tactics is large and their names and
+  semantics must be remembered by heart. Tutorials are provided to guide
+  the user step-by-step in a few proofs, suggesting the tactics that must
+  be used. We believe that our tutors for beginners can provide an auxiliary
+  learning tool: after the tutorial, the user is not suddenly left alone
+  with the system, but she can experiment with variations of the exercises
+  given in the tutorial as much as she likes, still getting useful
+  suggestions. Thus the user is allowed to focus on learning how to do a
+  formal proof instead of wasting effort trying to remember the interface
+  to the system.
+ \item \emph{Tutors for Computationally Expensive Tactics}. Several tactics
+  have an unpredictable behavior, in the sense that it is unfeasible to
+  predict whether they will succeed or fail when applied, and what their
+  result will be. Among them there are several tactics that are either
+  computationally expensive or resource-consuming. In the first case, the
+  user is not willing to try a tactic and wait for a long time just to
+  understand its outcome: she would prefer to keep on concentrating on the
+  proof, have the tactic applied in the background, and receive an
+  out-of-band notification of its success. The second case is similar, but
+  the tactic application must be performed on a remote machine to avoid
+  overloading the user's host with several concurrent resource-consuming
+  applications.
+
+  Finally, several complex tactics, and in particular all the tactics based
+  on reflexive techniques, depend on a pretty large set of definitions,
+  lemmas and theorems. When these tactics are applied, the system needs to
+  retrieve and load all the lemmas. Pre-loading all the material needed by
+  every tactic can quickly lead to long initialization times and to large
+  memory footprints. A specialized tutor running on a remote machine,
+  instead, can easily pre-load the required theorems.
+
+  As an example of a computationally expensive task, we have implemented
+  a tutor for the \emph{Ring} tactic \cite{ringboutin}. The tutor is able
+  to prove an equality over a ring by reducing both members to a common
+  normal form, as in the small example below.
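+  For instance (the example is ours, chosen for illustration), both members
+  of the goal
+  \begin{displaymath}
+    (x + 1) \cdot y \;=\; y + x \cdot y
+  \end{displaymath}
+  reduce to the same normal form, e.g. $x \cdot y + y$ (its exact shape
+  depends on the monomial ordering used by the normalization function),
+  and the goal is closed.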
+  The reduction, which may require some time in complex cases,
+  is based on the usual commutative, associative and neutral element
+  properties of a ring. The tactic is implemented using a reflexive
+  technique, which
+  means that the reduction trace is not stored in the proof-object itself:
+  the type-checker is able to perform the reduction on the fly thanks to
+  the conversion rules of the system. As a consequence, in the library there
+  must be stored both the algorithm used for the reduction and the proof of
+  correctness of the algorithm, based on the ring axioms. This big proof
+  and all of its lemmas must be retrieved and loaded in order to apply the
+  tactic. The Ring tutor loads and caches all the required theorems the
+  first time it is contacted.
+ \item \emph{Intelligent Tutors}. Expert users can already benefit from the
+  previous class of tutors. Nevertheless, to achieve a significant
+  productivity gain, they need more intelligent tutors implementing
+  domain-specific theorem provers or able to perform complex computations.
+  These tutors are not just plain implementations of tactics or decision
+  procedures, but can be more complex software agents interacting with
+  third-party software, such as proof-planners, CASs or theorem-provers.
+
+  To test the productivity impact of intelligent tutors, we have implemented
+  a tutor that is interfaced with the \helm{}
+  Search-Engine\footnote{\url{http://helm.cs.unibo.it/library.html}} and that
+  is able to look for every theorem in the distributed library that can
+  be applied to proceed in the proof. Even if the tutor's deductive power
+  is extremely limited\footnote{We do not attempt to check if the new goals
+  obtained by applying a lemma can be automatically proved or, even better,
+  automatically disproved to reject the lemma.}, it is not unusual for
+  the tutor to come up with precious hints that can save several minutes of
+  work that would otherwise be spent proving again already proven results or
+  figuring out where the lemmas could have been stored in the library.
+\end{enumerate}
+
+\section{Conclusions and Future Work}
+\label{conclusions}
+ In this paper we described a suggestion engine architecture for
+ proof-assistants: the client (a proof-assistant) sends the current proof
+ status to several distributed \wss{} (called tutors) that try to progress
+ in the proof and, in case of success, send back an appropriate hint
+ (a proof-plan) to the user. The user, who in the meantime has been able to
+ reason and progress in the proof, is notified of the hints and can decide
+ to apply or ignore them. A broker is provided to decouple the clients and
+ the tutors and to allow the client to locate and invoke the available remote
+ services. The whole architecture is an instance of the MONET architecture
+ for Mathematical \wss{}. It constitutes a reimplementation of the core
+ features of the pioneering \OmegaAnts{} system in the new \wss{}
+ framework.
+
+ A running prototype has been implemented as part of the
+ \helm{} project \cite{helm}
+ and we already provide several tutors. Some of them are simple tutors that
+ try to apply one or more tactics of the \helm{} Proof-Engine, which is also
+ our client. We also have a much more complex tutor that is interfaced
+ with the \helm{} Search-Engine and looks for lemmas that can be directly
+ applied.
+
+ Future work comprises the implementation of new features and tutors, and
+ the embedding of the system in larger test cases. For instance, one
+ interesting case study would be interfacing a CAS such as Maple to the
+ \hbugs{} broker, developing at the same time a tutor that implements the
+ Field tactic of Coq, which proves the equality of two expressions in an
+ abstract field by reducing both members to the same normal form. CASs can
+ produce several compact normal forms, which are particularly informative
+ to the user and may suggest how to proceed in a proof. Unfortunately,
+ CASs do not
+ provide any certificate about the correctness of the simplification. On
+ the contrary, the Field tactic certifies the equality of two expressions,
+ but produces normal forms that are hardly a simplification of the original
+ formula. The benefits for the CAS would be obtained by using the Field tutor
+ to certify the CAS simplifications, proving that the Field normal form
+ of an expression is preserved by the simplification.
+ More advanced tutors could exploit the CAS to reduce the
+ goal to compact normal forms \cite{maplemodeforCoq}, making the Field tutor
+ certify the simplification according to the skeptical approach.
+
+ We have many plans for further developing both the \hbugs{} architecture and
+ our prototype. Interesting results could be obtained by
+ augmenting the informative content of each suggestion. For example, we could
+ modify the broker so that negative results are also sent back to the client.
+ Those negative suggestions could be reflected in the user interface by
+ deactivating commands to narrow the choice of tactics available to the user.
+ This approach could be interesting especially for novice users, but it
+ requires the clients to increase their level of trust in the other actors.
+
+ We also plan to add a rating mechanism to the architecture. A first
+ improvement in this direction could be distinguishing between hints that,
+ when applied, are able to completely close one or more goals, and hints
+ that progress in the proof by reducing one or more goals to new goals:
+ since the new goals can be false, the user can be forced later on to
+ backtrack.
+
+ Other heuristics and/or measures could be added to rate
+ hints and show them to the user in a particular order: an interesting one
+ could be a measure that tries to minimize the size of the generated proof,
+ therefore privileging non-overkilling solutions \cite{ring}.
+
+ We are also considering following the \OmegaAnts{} path, adding
+ ``recursion'' to the system so that the proof statuses resulting from the
+ application of old hints are cached somewhere and can be used as starting
+ points for new hint searches. The approach is interesting, but it represents
+ a big shift towards automatic theorem proving: thus we must consider whether
+ it is worth the effort, given the increasing availability of automation in
+ proof-assistant tactics and the ongoing development of \wss{} based on
+ existing and well-developed theorem provers.
+
+ Even if not strictly part of the \hbugs{} architecture, the graphical user
+ interface (GUI) of our prototype needs a lot of improvement if we want
+ it to be really usable by novices. In particular, a critical issue
+ is avoiding continuous distractions for the user caused by the hints
+ that are asynchronously pushed to her.
+
+ Our \wss{} still lack a real integration in the MONET architecture,
+ since we do not provide the different ontologies to describe our problems,
+ solutions, queries, and services. In the short term, completing this task
+ could provide significant feedback to the MONET consortium and would
+ enlarge the current set of available MONET actors on the Web. In the long
+ term, new, more intelligent tutors could be developed on top of already
+ existing MONET \wss{}.
+
+ To conclude, \hbugs{} is a nice experiment meant to understand whether the
+ current \wss{} technology is mature enough to have a concrete and useful
+ impact on the daily work of proof-assistant users. So far, only the tutor
+ that is interfaced with the \helm{} Search-Engine has effectively increased
+ the productivity of expert users. The usefulness of the tutors developed for
+ beginners, instead, needs further assessment.

 \begin{thebibliography}{01}
-% \bibitem{ALF} The ALF family of proof-assistants:\\
-% {\tt http://www.cs.chalmers.se/ComputingScience/Research/\\Logic/implementation.mhtml}
-%
-% \bibitem{Coq} The Coq proof-assistant:\\
-%  {\tt http://coq.inria.fr/}
-%
-% \bibitem{FORMAVIE} The Formavie project:\\
-%  {\tt http://http://www-sop.inria.fr/oasis/Formavie/}
-%
-% \bibitem{EHELM} The HELM project:\\
-%  {\tt http://www.cs.unibo.it/helm/}
-%
-% \bibitem{MATHWEB} The MathWeb project:\\
-%  {\tt http://www.mathweb.org/}
-%
-% \bibitem{PCOQ} The PCoq project:\\
-%  {\tt http://www-sop.inria.fr/lemme/pcoq/}
-%
-% \bibitem{HELM} A.Asperti, L.Padovani, C.Sacerdoti Coen, I.Schena.
-% Towards a library of formal mathematics. Panel session of
-% the 13th International Conference on Theorem Proving in Higher Order Logics (TPHOLS'2000),
-% Portland, Oregon, USA.
-%
-% \bibitem{Ring} S. Boutin. Using reflection to build efficient and certified
-% decision procedures. In Martin Abadi and Takahashi Ito, editors, TACS'97,
-% volume 1281. LNCS, Springer-Verlag, 1997.
-%
-% \bibitem{YANNTHESIS} Y.Coscoy. \emph{Explication textuelle de preuves pour le
-% Calcul des Constructions Inductives}, PhD. Thesis, Universit\'e de Nice-Sophia
-% Antipolis, 2000.
-%
-% \bibitem{ALFA} T. Hallgreen, Aarne Ranta. An Extensible Proof Text Editor.
-% Presented at LPAR2000.
-%
-% \bibitem{Necula} G. Necula, P. Lee. Safe Kernel Extensions Without Run-Time
-% Checking. Presented at OSDI'96, October 1996.
-%
-% \bibitem{Necula2} G. Necula, P. Lee. Efficient Representation and Validation of Proofs. Presented at LICS'98, June 1998
-%
-% \bibitem{Werner} B. Werner. \emph{Une Th\'eorie des Constructions Inductives},
-% PhD. Thesis, Universit\'e Paris VII, May 1994.
+\bibitem{ws-glossary} Web Services Glossary, W3C Working Draft, 14 May 2003.\\
+  \url{http://www.w3.org/TR/2003/WD-ws-gloss-20030514/}
+
+\bibitem{wsdlbindings} Web Services Description Language (WSDL)
+  Version 1.2: Bindings, W3C Working Draft, 24 January 2003.\\
+  \url{http://www.w3.org/TR/wsdl12-bindings/}
+
+\bibitem{ws1} A. Armando, D. Zini. Interfacing Computer Algebra and
+  Deduction Systems via the Logic Broker Architecture. In Proceedings
+  of the Eighth Calculemus Symposium, St. Andrews, Scotland, 6--7 August 2000.
+
+\bibitem{ws2} O. Caprotti. Symbolic Evaluator Service. Project Report of
+  the MathBroker Project, RISC-Linz, Johannes Kepler University, Linz,
+  Austria, May 2002.
+
+\bibitem{helm} A. Asperti, F. Guidi, L. Padovani, C. Sacerdoti Coen, I. Schena.
+  Mathematical Knowledge Management in HELM.
+  In Annals of Mathematics and Artificial Intelligence, 38(1): 27--46,
+  May 2003.
+
+\bibitem{omegaants1} C. Benzm\"uller, V. Sorge. O-Ants -- An Open Approach
+  at Combining Interactive and Automated Theorem Proving. In M. Kerber and
+  M. Kohlhase (eds.), Integration of Symbolic and Mechanized Reasoning, pp.
+  81--97, 2000.
+
+\bibitem{omegaants2} C. Benzm\"uller, M. Jamnik, M. Kerber, V. Sorge.
+  Agent-based Mathematical Reasoning. In A. Armando and T. Jebelean (eds.),
+  Electronic Notes in Theoretical Computer Science, 23(3), Elsevier, 1999.
+
+\bibitem{omega} C. Benzm\"uller, L. Cheikhrouhou, D. Fehrer, A. Fiedler,
+  X. Huang, M. Kerber, M. Kohlhase, K. Konrad, E. Melis, A. Meier,
+  W. Schaarschmidt, J. Siekmann, V. Sorge. OMEGA: Towards a Mathematical
+  Assistant. In W. McCune (ed.), Proceedings of the 14th Conference on
+  Automated Deduction (CADE-14), Springer LNAI vol. 1249, pp. 252--255,
+  Townsville, Australia, 1997.
+
+\bibitem{ringboutin} S. Boutin. Using reflection to build efficient and
+  certified decision procedures. In Martin Abadi and Takayasu Ito, editors,
+  TACS'97, volume 1281. LNCS, Springer-Verlag, 1997.
+
+\bibitem{maplemodeforCoq} David Delahaye, Micaela Mayero.
+  A Maple Mode for Coq. Contribution to the Coq library.\\
+  \url{http://coq.inria.fr/contribs/MapleMode.html}
+
+\bibitem{MONET-Overview} The MONET Consortium, MONET Architecture Overview,
+  Public Deliverable D04 of the MONET Project.\\
+  \url{http://monet.nag.co.uk/cocoon/monet/publicsdocs/monet-overview.pdf}
+
+\bibitem{mowglicic} C. Sacerdoti Coen. Exportation Module, MoWGLI Deliverable
+  D2.a.\\
+  \url{http://mowgli.cs.unibo.it/html\_no\_frames/deliverables/transformation/d2a.html}
+
+\bibitem{ring} C. Sacerdoti Coen. Tactics in Modern Proof-Assistants: the
+  Bad Habit of Overkilling. In Supplementary Proceedings of the 14th
+  International Conference TPHOLs 2001, pp. 352--367, Edinburgh.
+
+\bibitem{zack} S. Zacchiroli. \emph{Web services per il supporto alla
+  dimostrazione interattiva} (in Italian), Master's Thesis, University of
+  Bologna, 2002.
+
+\bibitem{ws3} J. Zimmer and M. Kohlhase. System Description: The MathWeb
+  Software Bus for Distributed Mathematical Reasoning.
+  In Proceedings of the 18th International Conference on Automated Deduction
+  (CADE-18), LNAI 2392, Springer-Verlag, 2002.
+
+\bibitem{ws4} R. Zippel. The MathBus. In Workshop on Internet Accessible
+  Mathematical Computation at ISSAC'99, Vancouver, Canada, July 28--31, 1999.
 \end{thebibliography}