1 \documentclass[runningheads]{llncs}
7 % \myincludegraphics{filename}{place}{width}{caption}{label}
8 \newcommand{\myincludegraphics}[5]{
11 \includegraphics[width=#3]{eps/#1.eps}
19 \usepackage{draftstamp}
21 \newcommand{\musing}{\texttt{musing}}
22 \newcommand{\musings}{\texttt{musings}}
23 \newcommand{\ws}{Web-Service}
24 \newcommand{\wss}{Web-Services}
25 \newcommand{\hbugs}{H-Bugs}
26 \newcommand{\helm}{HELM}
27 \newcommand{\Omegapp}{$\Omega$mega}
28 \newcommand{\OmegaAnts}{$\Omega$mega-Ants}
30 \title{Brokers and Web-Services for Automatic Deduction: a Case Study}
32 \author{Claudio Sacerdoti Coen \and Stefano Zacchiroli}
35 Department of Computer Science\\
36 University of Bologna\\
37 Via di Mura Anteo Zamboni 7, 40127 Bologna, ITALY\\
38 \email{sacerdot@cs.unibo.it}
40 Department of Computer Science\\
41 \'Ecole Normale Sup\'erieure\\
42 45, Rue d'Ulm, F-75230 Paris Cedex 05, FRANCE\\
43 \email{zack@cs.unibo.it}
53 We present a planning broker and several Web-Services for automatic deduction.
54 Each Web-Service implements one of the tactics usually available in an
55 interactive proof-assistant. When the broker is submitted a "proof status" (an
56 unfinished proof tree and a focus on an open goal) it dispatches the proof to
57 the Web-Services, collects the successfull results, and send them back to the
58 client as "hints" as soon as they are available.
60 In our experience this architecture turns out to be helpful both for
61 experienced users (who can take benefit of distributing heavy computations)
62 and beginners (who can learn from it).
65 \section{Introduction\ednote{Zack: attualmente sembra che posto ne abbiamo: se
66 la situazione non muta un 10 righe su \helm{} possiamo metterle, non tanto per
67 la conferenza quanto per i posteri}}
68 The \ws{} approach at software development seems to be a working solution for
69 getting rid of a wide range of incompatibilities between communicating
70 software applications. W3C's efforts in standardizing related technologies
71 grant longevity and implementations availability for frameworks based on
72 \wss{} for information exchange. As a direct conseguence, the number of such
73 frameworks is increasing and the World Wide Web is moving from a disorganized
74 repository of human-understandable HTML documents to a disorganized repository
75 of applications working on machine-understandable XML documents both for input
78 The big challenge for the next future is to provide stable and reliable
79 services over this disorganized, unreliable and ever-evolving architecture.
80 The standard solution \ednote{zack: buhm! :-P} is providing a further level of
81 stable services (called \emph{brokers}) that behave as common gateway/address
82 for client applications to access a wide variety of services and abstract over
85 \oldpart Since the \emph{Declaration of Linz}, the MONET Consortium \cite{MONET},
86 following the guidelines \ednote{guidelines non e' molto appropriato, dato che
87 il concetto di broker non e' definito da W3C e co} of the \wss{}/brokers
88 approach, is working on the development of a framework aimed at providing a
89 set of software tools for the advertisement and discovering of mathematical
91 \begin{newpart}Since the \emph{Declaration of Linz}, the MONET Consortium
92 \cite{MONET}, is working on the development of a framework, based on the
93 \wss{}/brokers approac, aimed at providing a set of software tools for the
94 advertisement and the discovery of mathematical \wss{}.\end{newpart}
95 %CSC This framework turns out to be strongly based on both \wss{} and brokers.
97 Several groups have already developed \wss{} providing both computational and
98 reasoning capabilities \cite{???,???,???}\ednote{trovare dei puntatori carini
99 dalle conferenze calculemus}: the formers are implemented on top of
100 Computer Algebra Systems; the latters provide interfaces to well-known
101 theorem provers. Proof-planners, proof-assistants, CAS and
102 domain-specific problem solvers are natural candidates to be client of these
103 services. Nevertheless, so far the number of examples in the literature has
104 been extremely low and the concrete benefits are still to be assessed.
106 In this paper we present an architecture, namely \hbugs{}, implementing a
107 \emph{suggestion engine} for the proof assistant developed on behalf of the
108 \helm{} project. We provide several \wss{} (called \emph{tutors}) able to
109 suggest possible ways to proceed in a proof. The tutors are orchestrated
110 by a broker (a \ws{} itself) that is able to distribute a proof
111 status from a client (the proof-assistant) to the tutors;
112 each tutor try to make progress in the proof and, in case
113 of success, notify the client that shows an \emph{hint} to the user.
114 Both the broker and the tutors are instances of the homonymous entities of
117 A precursor of \hbugs{} is the \OmegaAnts{} project \cite{???},
118 which provided similar functionalities to the
119 \Omegapp{} proof-planner \cite{Omega}. The main architectural difference
120 between \hbugs{} and \OmegaAnts{} are that the latter is based on a
121 black-board architecture and it is not implemented using \wss{} and
122 brokers. Other differences will be detailed in Sect. \ref{conclusions}.
124 In Sect. \ref{architecture} we present the architecture of \hbugs{}.
125 Further implementation details are given in Sect. \ref{implementation}.
126 Sect. \ref{tutors} is an overview of the tutors that have been implemented.
127 As usual, the paper ends with the conclusions and future works.
130 {CSC: Non so se/dove mettere queste parti.
131 Zack: per ora facciamo senza e vediamo se/quanto spazio abbiamo, la prima parte
132 non e' molto utile, ma la seconda sugli usi tipici di proof assistant
135 Despite of that the proof assistant case seems to be well suited to
136 investigate the usage of many different mathematical \wss{}. Indeed: most
137 proof assistants are still based on non-client/server architectures, are
138 application-centric instead of document-centric, offer a scarce level of
139 automation leaving entirely to the user the choice of which macro (usually
140 called \emph{tactic}) to use in order to make progress in a proof.
142 The average proof assistant can be, for example, a client of a \ws{}\
143 interfacing a specific or generic purpose theorem prover, or a client of a
144 \ws{} interfacing a CAS to simplify expressions in a particular mathematical
148 \section{An \hbugs{} Bird'S Eye View\ednote{Zack: sono in vena di boiate
151 \myincludegraphics{arch}{t}{8cm}{\hbugs{} architecture}{\hbugs{} architecture}
153 The \hbugs{} architecture (depicted in Fig. \ref{arch}) is based on three
154 different kinds of actors: \emph{clients}, \emph{brokers}, and \emph{tutors}.
155 Each actor present one or more \ws{} interfaces to its neighbours \hbugs{}
158 In this section we will detail the role and requirements of each kind of
159 actors and discuss about the correspondencies between them and the MONET
160 entities described in \cite{MONET-Overview}.
163 An \hbugs{} client is a software component able to produce \emph{proof
164 status} and to consume \emph{hints}.
166 A proof status is a representation of an incomplete proof and is supposed to
167 be informative enough to be used by an interactive proof assistant. No
168 additional requirements exist on the proof status, but there should be an
169 agreement on its format between clients and tutors. An hint is a
170 representation\ednote{CSC: non c'\'e un sinonimo pi\'u carino? Zack: l'unico
171 decente sembra essere nuovamente "suggestion"}
172 of a step that can be performed in order to proceed in an
173 incomplete proof. Usually it represents a reference to a tactic available
174 on some proof assistant along with an instantiation for its formal
175 parameters. More structured hints can also be used: an hint can be
176 as complex as a whole proof-plan.
178 \myincludegraphics{interfaces}{t}{8cm}{\hbugs{} \wss{} interfaces}{\hbugs{}
181 Using W3C's terminology \cite{ws-glossary}, clients act both as \ws{}
182 provider and requester, see \ednote{Zack: va bene "see"?, "cfr" credo sia
183 solo italiano ...} Fig. \ref{interfaces}.
184 They act as providers for the broker (to receive hints)
185 and as requesters (to submit new status) \ednote{Zack: non manca il "for"
186 anche per i requesters?}. Clients
187 additionally use broker service to know which tutors are available and to
188 subscribe to one or more of them.
190 Usually, when the role of client is taken by an interactive proof assistant,
191 new status are sent to the broker as soon as the proof change (e.g. when the
192 user applies a tactic or when a new proof is started) and hints are shown to
193 the user be the means of some effect in the user interface (e.g. popping a
194 dialog box or enlightening a tactic button).
196 \hbugs{} clients act as MONET clients and ask brokers to provide access to a
197 set of services (the tutors). \hbugs{} has no actors corresponding to
198 MONET's Broker Locating Service (since the client is supposed to know the
199 URI of at least one broker) and Mathematical Object Manager (since proof
200 status are built on the fly and not stored).
203 Brokers are the key actors of the \hbugs{} architecture since they
204 act as intermediaries between clients and tutors. They behave as \wss{}
205 providers and requesters for \emph{both} clients and tutors, see Fig.
208 With respect to client, a broker act as \ws{} provider, receiving the
209 proof status and forwarding it to one or more tutors.
210 It also acts as a \ws{} requester sending
211 hints to client as soon as they are available from the tutors.
213 With respect to tutors, the \ws{} provider role is accomplished by receiving
214 hints as soon as they are produced; as a requester one is accomplished
215 by requesting computations (\emph{musings} in \hbugs{} terminology) on
216 status received by clients and by stopping already late but still
218 \ednote{CSC: Sta frase va comunque riscritta perch\'e non si capisce una
219 mazza. Zack: io intanto ho aggiunto una figura, vorrei pero' un tuo commento
220 sulla utilita'/quantita' delle figure ...}
222 Additionally brokers keep track of available tutors and clients
225 \hbugs{} brokers act as MONET brokers implementing the following components:
226 Client Manager, Service Registry Manager (keeping track of available
227 tutors), Planning Manager (chosing the available tutors among the ones to
228 which the client is subscribed), Execution Manager. \ednote{non e' chiaro se
229 in monet le risposte siano sincrone o meno, se cosi' fosse dobbiamo
230 specificare che nel nostro caso non lo sono}
233 Tutors are software component able to consume proof status producing hints.
234 \hbugs{} doesn't specify by which means hints should be produced: tutors can
235 use any means necessary (heuristics, external theorem prover or CAS, ...).
236 The only requirement is that exists an agreement on the formats of proof
239 Tutors act both as \ws{} providers and requesters for the broker. As
240 providers, they wait for commands requesting to start a new \musing{} on
241 a given proof status or to stop an old, out of date, \musing{}. As
242 requesters, they signal to the broker the end of a \musing{} along with its
243 outcome (an hint in case of success or a notification of failure).
245 \hbugs{} tutors act as MONET services.
247 \section{Implementation's Highlights}
248 \label{implementation}
249 \ednote{Zack: l'aspetto grafico di questa parte e' un po' peso, possiamo
250 aggiungere varie immagini volendo, e.g.: schema dei thread di un tutor, sample
251 code di un tutor generato automaticamente}
252 In this section we present some of the most relevant implementation details of
253 the \hbugs{} architecture.
256 \paragraph{Proof status}
257 In our implementation of the \hbugs{} architecture we used the proof
258 assistant of the \helm{} project (codename ``gTopLevel'') as an \hbugs{}
259 client. Thus we have implemented serialization/deserialization capabilities
260 fot its internal status. In order to be able to describe \wss{} that
261 exchange status in WSDL using the XML Schema type system, we have chosen an
262 XML format as the target format for the serialization.
264 A schematic representation of the gTopLevel internal status is depicted in
265 Fig. \ref{status}. Each proof is representated by a tuple of four elements:
266 \emph{uri}, \emph{metasenv}, \emph{proof}, \emph{thesis}.
268 \myincludegraphics{status}{t}{8cm}{gTopLevel proof status}{gTopLevel proof
272 \item[uri]: an URI chosen by the user at the beginning of the proof
273 process. Once (and if) proved, that URI will globally identify the term
274 inside the \helm{} library (given that the user decides to save it).
275 \item[thesis]: the thesis of the ongoing proof
276 \item[proof]: the current incomplete proof tree. It can contain
277 \emph{metavariables} (holes) that stands for the parts of the proof
278 that are still to be completed. Each metavariables appearing in the
279 tree references one element of the metavariables environment
281 \item[metasenv]: the metavariables environment is a list of
282 \emph{goals} (unproved conjectures).
283 In order to complete the proof, the user has to instantiate every
284 metavariable in the proof with a proof of the corresponding goal.
285 Each goal is identified by an unique identifier and has a context
286 and a type ( the goal thesis). The context is a list of named
287 hypotheses (declarations and definitions). Thus the context and the goal
288 thesis form a sequent, which is the statement of the proof that will
289 be used to instatiate the metavariable occurrences.
292 Each of these information is represented in XML as described in
293 \cite{csc-thesis}. Additionally, an \hbugs{} status carry the unique
294 identifier of the current goal, which is the goal the user is currently
295 focused on. Using this value it is possible to implement different client
296 side strategies: the user could ask the tutors to work on the goal
297 she is considering or to work on the other ``background'' goals.
300 An hint in the \hbugs{} architecture should carry enough information to
301 permit the client to progress in the current proof. In our
302 implementation each hint corresponds to either one of the tactics available
303 to the user in gTopLevel (together with its actual arguments) or a set
304 of alternative suggestions (a list of hints).
306 For tactics that don't require any particular argument (like tactics that
307 apply type constructors or try to automatically conclude equality goals)
308 only the tactic name is represented in the hint. For tactics that need terms
309 as arguments (for example the \emph{Apply} tactic) the hint includes a
310 textual representation of them, using the same representation used by the
311 interactive proof assistant when querying user for terms. In order to be
312 trasmitted between \wss, hints are serialized in XML.
314 Actually it is also possible for a tutor to return more hints at once,
315 grouping them in a particular XML element.
316 This feature turns out to be particulary useful for the
317 \emph{searchPatternApply} tutor (Sect. \ref{tutors}) that
318 query a term database and return to the client a list of all terms that
319 could be used to complete the proof. This particular hint is encoded as a
320 list of Apply hints, each of them having one of the results as term
323 We would like to stress that the \hbugs{} architecture has no dependency
324 on either the hint or the status representation: the only message parts
325 that are fixed are those representing the administrative messages
326 (the envelops in the \wss terminology). In particular, the broker can
327 manage at the same time several sessions working on different status/hints
328 formats. Of couse, there must be an agreement between the clients
329 and the tutors on the format of the data exchanged.
331 In our implementation the client does not trust the tutors' hints:
332 being encoded as references to available tactics imply
333 that an \hbugs{} client, on receipt of an hint, simply try to reply the work
334 done by a tutor on the local copy of the proof. The application of the hint
335 can even fail to type check and the client copy of the proof can be left
336 undamaged after spotting the error. Note, however, that it is still
337 possible to implement a complex tutor that looks for a possible proof and
338 send back to the client an hint whose argument is a witness (a trace) of
339 the proof found: the client applies the hint reconstructing (and checking
340 the correctness of) the proof from the witness, without having to
341 re-discover the proof itself.
343 An alternative implementation where the tutors are trusted would simply
344 send back to the client a new proof-status. Upong receiving the
345 proof-status, the client would just override its current proof status with
346 the suggested one. In the case of those clients which are implemented
347 using proof-objects (as the Coq proof-assistant, for instance), it is
348 still possible for the client to type-check the proof-object and reject
349 wrong hints. The systems that are not based on proof-objects
350 (as PVS, NuPRL, etc.), instead, have to trust the new proof-status. In this
351 case the \hbugs{} architecture needs to be extended with clients-tutors
354 \paragraph{Registries}
355 Being central in the \hbugs{} architecture, the broker is also responsible
356 of accounting operations both for clients and tutors. These operations are
357 implemented using three different data structures called \emph{registries}:
358 clients registry, tutors registry and \musings{} registry.
360 In order to use the suggestion engine a client should register itself to the
361 broker and subscribe to one or more tutors. The registration phase is
362 triggered by the client using the \texttt{Register\_client} method of the
363 broker to send him an unique identifier and its base URI as a
364 \ws{}. After the registration, the client can use broker's
365 \texttt{List\_tutors} method to get a list of available tutors.
366 Eventually\ednote{CSC: Vuoi veramente dire eventually qui? Zack: si, prima o
367 poi lo faranno ...} the
368 client can subscribe to one or more of these using broker's \emph{Subscribe}
369 method. Clients can also unregister from brokers using
370 \texttt{Unregister\_client} method.
372 The broker keeps track of both registered clients and clients' subscriptions
373 in the clients registry.
375 In order to be advertised to clients during the subscription phase, tutors
376 should register to the broker using broker's \texttt{Register\_tutor}
377 method. This method is really similar to the \texttt{Register\_client}
378 one: tutors are required to send an unique identify and a base URI for
380 Additionally tutors are required to send an human readable description of
381 their capabilities; this information could be used by client's user to
382 decide which tutors he needs to subscribe to. Like clients, tutors can
383 unregister from brokers using \texttt{Unregister\_broker} method.
385 Track of available tutors is kept in the tutors
386 registry.\ednote{Non si mette mai un paragrafo lungo meno di una linea!}
388 Each time the client status change, the status is sent to the
389 broker using its \emph{Status} method. Using both clients registry (to
390 lookup client's subscription) and tutors registry (to check if some tutors
391 has unsubscribed), the broker is able to decide to which tutors the
392 new status must be forwarded.\ednote{CSC: qui o nei lavori futuri parlare
393 della possibilit\'a di avere un vero brocker che multiplexi le richieste
394 del tutor localizzando i servizi, etc.}
396 The forwarding operation is performed using tutors' \texttt{Start\_musing}
397 method, that is a request to start a new computation (\emph{\musing{}}) on a
398 given status. The return value for \texttt{Start\_musing} method is a
399 \musing{} identifier that is saved in the \musings{} registry along with
400 the identifier of the client that triggered the starting of the \musing{}.
402 As soon as a \musing{} is completed, the owning tutor informs the broker
403 using its \texttt{Musing\_completed} method; the broker can now remove the
404 \musing{} entry from the \musings{} registry and, depending on its outcome,
405 inform the client. In case of success one of the \texttt{Musing\_completed}
406 arguments is an hint to be sent to the client, otherwise there's no need to
407 inform him and the \texttt{Musing\_completed} method is called
408 just to update the \musings{} registry.
410 Consulting the \musings{} registry, the tutor\ednote{CSC: ma \'e vero o
411 stai delirando? Zack: e' vero, non ti fidi? :-) Up to delay di rete
412 ovviamente ...} is able to know, at each time,
413 which \musings{} are in execution on which tutor. This peculiarity is
414 exploited by the broker on invocation of Status method. Receiving a new
415 status from the client imply indeed that the previous status no longer
416 exists and all \musings{} working on it should be stopped: additionally to
417 the already described behaviour (i.e. starting new \musings{} on the
418 received status), the tutor takes also care of stopping ongoing computation
419 invoking \texttt{Stop\_musing} tutors' method.
422 As already discussed, all \hbugs{} actors act as \wss{} offering one or more
423 services to neighbour actors. To grant as most accessibility as possible to
424 our \wss{} we have chosen to bind them using the HTTP/POST bindings
425 described in \cite{????}\footnote{Given that our proof assistant was
426 entirely developed in the Objective Caml language, we have chosen to
427 develop also \hbugs{} in that language in order to maximize code reuse. To
428 develop \wss{} in Objective Caml we have developed an auxiliary generic
429 library (\emph{O'HTTP}) that can be used to write HTTP 1.1 web servers and
430 abstract over GET/POST parsing. This library supports different kinds of web
431 servers architecture, including multi-process and multi-threaded ones.}.
434 Each tutor expose a \ws{} interface and should be able to work, not only for
435 many different clients referring to a common broker, but also for many
436 different brokers. The potential high number of concurrent clients imposes
437 a multi-threaded or multi-process architecture.
439 Our current implementation is based on a multi threaded architecture
440 exploiting the capabilities of the O'HTTP library. Each tutor is composed
441 by two thread always running plus
442 an additional thread for each running \musing{}. One thread is devoted to
443 listening for incoming \ws{} request; upon correct receiving requests it
444 pass the control to the second always-running thread which handle the pool
445 of running \musings{}. When a new \musing{} is requested, a new thread is
446 spawned to work them out; when a request to interrupt an old \musing{} is
447 received, the thread actually running them is killed freeing its
448 resources.\ednote{CSC: A cosa dobbiamo questa architettura delirante? Se non
449 ricordo male al problema dell'uccisione dei thread. Ora o si spiega
450 il motivo di questa architettura o si glissa/bluffa. Zack: cosa ti sembra
451 delirante? che i thread vengono uccisi? ... non mi e' molto chiaro ...}
453 This architecture turns out to be scalable and allows the running threads
454 to share the cache of loaded (and type-checked) theorems.
455 As we will explain in Sect. \ref{tutors}, this feature turns out to be
456 really useful for tactics that rely on a huge but fixed set of lemmas,
457 as every reflexivite tactic.
459 The implementation of a tutor with the described architecture is not that
460 difficult having a language with good threading capabilities (as OCaml has)
461 and a pool of already implemented tactics (as gTopLevel has).
462 Still working with threads is known to be really error prone due to
463 concurrent programming intrinsic complexity. Moreover, there is a
464 non-neglectable part of code that needs to be duplicated in every tutor:
465 the code to register the tutor to the broker and to handle HTTP requests;
466 the code to manage the creation and termination of threads; and the code for
467 parsing the requests and serializing the answers. As a consequence we
468 have written a generic implementation of a tutor which is parameterized
469 over the code that actually propose the hint and some administrative
470 data (as the port the tutor will be listening to).
472 The generic tutor skeleton is really helpful in writing new tutors.
473 Nevertheless, the code obtained by converting existing tactics into tutors
474 is still quite repetitive: every tutor that wraps a tactic has to
475 instantiate its own copy of the proof-engine kernel and, for each request,
476 it has to override its status, guess the tactic arguments, apply the tactic
477 and, in case of success, send back an hint with the tactic name and the
478 chosen arguments. Of course, the complex part of the work is guessing the
479 right arguments. For the simple case of tactics that do not require
480 any argument, though, we are able to automatically generate the whole
481 tutor code given the tactic name. Concretely, we have written a
482 tactic-based tutor template and a script that parses an XML file with
483 the specification of the tutor and generates the tutor's code.
484 The XML file holds the tutor's port, the code to invoke the tactic,
485 the hint that is sent back upon successfull application and a
486 human readable explanation of the tactic implemented by the tutor.
488 \section{The Implemented \hbugs Tutors}
490 To test the \hbugs{} architecture and to assess the utility of a suggestion
491 engine for the end user, we have implemented several tutors. In particular,
492 we have investigated three classes of tutors:
494 \item \emph{Tutors for beginners}. These are tutors that implement tactics
495 which are neither computationally expensive nor difficult to understand:
496 an expert user can always understand if the tactic can be applied or not
497 withouth having to try it. For example, the following implemented tutors
498 belong to this class:
500 \item \emph{Assumption Tutor}: it ends the proof if the thesis is
501 equivalent\footnote{In our implementation, the equivalence relation
502 imposed by the logical framework is \emph{convertibility}. Two
503 expressions are convertible when they reduce to the same normal form.
504 Two ``equal'' terms depending on free variables can be non-convertible
505 since free variables stop the reduction. For example, $2x$ is convertible
506 with $(3-1)x$ because they both reduce to the same normal form
507 $x + x + 0$; but $2x$ is not convertible to $x2$ since the latter is
508 already in normal form.}
509 to one of the hypotheses\footnote{
510 In some cases, expecially when non-trivial computations are involved,
511 the user is totally unable to figure out the convertibility of two terms.
512 In these cases the tutor becomes handy also for expert users.}.
513 \item \emph{Contradiction Tutor}: it ends the proof by \emph{reductio ad
514 adsurdum} if one hypothesis is equivalent to $False$.
515 \item \emph{Symmetry Tutor}: if the goal thesis is an equality, it
516 suggests to apply the commutative property.
517 \item \emph{Left/Right/Exists/Split/Reflexivity/Constructor Tutors}:
518 the Constructor Tutor suggests to proceed in the proof by applying one
519 or more constructors when the goal thesis is an inductive type or a
520 proposition inductively defined according to the declarative
521 style\footnote{An example of a proposition that can be given in
522 declarative style is the $\le$ relation: $\le$ is the smallest relation
523 such that $n \le n$ for every $n$ and $n \le m$ for every $n,m$ such
524 that $n \le p$ where $p$ is the predecessor of $m$. Thus, a proof
525 of $n \le n$ is simply the application of the first constructor to
526 $n$ and a proof of $n \le m$ is the application of the second
527 constructor to $n,m$ and a proof of $n \le m$.}.
528 Since disjunction, conjunction, existential quantification and
529 Leibniz equality are particular cases of inductive propositions,
530 all the other tutors of this class are restrictions of the
531 the Constructor tactic: Left and Right suggest to prove a disjunction
532 by proving its left/right member; Split reduces the proof of a
533 conjunction to the two proof of its members; Exists suggests to
534 prove an existential quantification by providing a
535 witness\footnote{This task is left to the user.}; Reflexivity proves
536 an equality whenever the two sides are convertible.
538 Beginners, when first faced with a tactic-based proof-assistant, get
539 lost quite soon since the set of tactics is large and their names and
540 semantics must be remembered by heart. Tutorials are provided to guide
541 the user step-by-step in a few proofs, suggesting the tactics that must
542 be used. We believe that our beginners tutors can provide an auxiliary
543 learning tool: after the tutorial, the user is not suddendly left alone
544 with the system, but she can experiment with variations of the proof given
545 in the tutorial as much as she like, still getting useful suggestions.
546 Thus the user is allowed to focus on learning how to do a formal proof
547 instead of wasting efforts trying to remember the interface to the system.
548 \item{Tutors for Computationally Expensive Tactics}. Several tactics have
549 an unpredictable behaviour, in the sense that it is unfeasible to understand
550 wether they will succeed or they will fail when applied and what will be
551 their result. Among them, there are several tactics either computationally
552 expensive or resources consuming. In the first case, the user is not
553 willing to try a tactic and wait for a long time just to understand its
554 outcome: she would prefer to keep on concentrating on the proof and
555 have the tactic applied in background and receive out-of-band notification
556 of its success. The second case is similar, but the tactic application must
557 be performed on a remote machine to avoid overloading the user host
558 with several concurrent resource consuming applications.
560 Finally, several complex tactics and in particular all the tactics based
561 on reflexive techniques depend on a pretty large set of definitions, lemmas
562 and theorems. When these tactics are applied, the system needs to retrieve
563 and load all the lemmas. Pre-loading all the material needed by every
564 tactic can quickly lead to long initialization times and to large memory
565 footstamps. A specialized tutor running on a remote machine, instead,
566 can easily pre-load the required theorems.
568 As an example of computationally expensive task, we have implemented
569 a tutor for the \emph{Ring} tactic. The tutor is able to prove an equality
570 over a ring by reducing both members to a common normal form. The reduction,
571 which may require some time in complex cases,
572 is based on the usual commutative, associative and neutral element properties
573 of a ring. The tactic is implemented using a reflexive technique, which
574 means that the reduction trace is not stored in the proof-object itself:
575 the type-checker is able to perform the reduction on-the-fly thanks to
576 the conversion rules of the system. As a consequence, in the library there
577 must be stored both the algorithm used for the reduction and the proof of
578 correctness of the algorithm, based on the ring axioms. This big proof
579 and all of its lemmas must be retrieved and loaded in order to apply the
580 tactic. The Ring tutor loads and cache all the required theorems the
581 first time it is conctacted.
582 \item{Intelligent Tutors}. Expert users can already benefit from the previous
583 class of tutors. Nevertheless, to achieve a significative production gain,
584 they need more intelligent tutors implementing domain-specific theorem
585 provers or able to perform complex computations. These tutors are not just
586 plain implementations of tactics or decision procedures, but can be
587 more complex software agents interacting with third-parties software,
588 such as proof-planners, CAS or theorem-provers.
590 To test the productivity impact of intelligent tutors, we have implemented
591 a tutor that is interfaced with the HELM
592 Proof-Engine\footnote{\url{http://mowgli.cs.unibo.it/library.html}} and that
593 is able to look for every theorem in the distributed library that can
594 be applied to proceed in the proof. Even if the tutor deductive power
595 is extremely limited\footnote{We do not attempt to check if the new goals
596 obtained applying a lemma can be authomatically proved or, even better,
597 auhomatically disproved to reject the lemma.}, it is not unusual for
598 the tutor to come up with precious hints that can save several minutes of
599 work that would be spent in proving again already proven results or
600 figuring out where the lemmas could have been stored in the library.
603 \section{Conclusions and Future Work\ednote{Zack: ho scritto la parte "future
606 We have many plan for further developing both the \hbugs{} architecture and
607 our prototype implementing them. Interesting results could be obtained
608 augmenting the informative content of each suggestion. We can for example
609 modify the broker so that also negative results are sent back to the client.
610 Those negative suggestions could be reflected in the user interface by
611 deactivating command narrowing the choice of tactics available to the user.
612 This approach could be interesting expecially for novice users, but require
613 the client to trust other actors a bit more than in the current approach.
615 We plan also to add some rating mechanism to the architecture. A first
616 improvement in this direction could be to distinguish between hints that, when
617 applied, are able to close one or more\ednote{Zack: e' possibile?} goals and
618 tactics that doesn't. Other heuristics and/or measures could be added to rate
619 hints and show them to the user in a particular order: an interesting one
620 could be a measure that try to minimize the size of the generated proof,
621 privileging therefore non-overkilling solutions\ednote{Zack: qua se vuoi ti
622 puoi pure auto-citare, cosa vuoi di piu'?}
624 We are also considering to follow the \OmegaAnts{} path more closely adding
625 ``recursion'' to the system so that proof status resulting from the
626 application of old hints are cached somewhere and could be used as a starting
627 point for new hint searches. Tough the approach is interesting, it moves the
628 focus closer to automatic theorem proving and we are considering if its worth
629 the effort given the increasing availability of automation in proof
632 Even if not strictly part of the \hbugs{} architecture, the graphical user
633 interface (GUI) of our prototype needs a lot of improvement if we would like
634 it to be usable by novices.\ednote{Zack: la parte della GUI non ci sta molto
635 bene, sembra che vogliamo fare soldi vendendo HBugs ... forse va solo
636 formulata meglio ...}
638 Finally we hardly believe that \wss{} such our brokers and tutors could be
639 used also as components uncoupled from our client, exploiting their
640 capabilities is for example possible to develop a web-based proof assistant
641 demonstrative application. In order to widen even more their accessibility we
642 plan to write MONET descriptions \ednote{Zack: controllare se esiste un
643 termine piu' preciso} of the \wss{} offered by \hbugs{} actors. For the same
644 reasons we plan also to add support for Mathematical Object Managers both for
645 store and retrieve proof status.
647 \begin{thebibliography}{01}
649 % \bibitem{ALF} The ALF family of proof-assistants:\\
650 % {\tt http://www.cs.chalmers.se/ComputingScience/Research/\\Logic/implementation.mhtml}
652 % \bibitem{Coq} The Coq proof-assistant:\\
653 % {\tt http://coq.inria.fr/}
655 % \bibitem{FORMAVIE} The Formavie project:\\
656 % {\tt http://http://www-sop.inria.fr/oasis/Formavie/}
658 % \bibitem{EHELM} The HELM project:\\
659 % {\tt http://www.cs.unibo.it/helm/}
661 % \bibitem{MATHWEB} The MathWeb project:\\
662 % {\tt http://www.mathweb.org/}
664 % \bibitem{PCOQ} The PCoq project:\\
665 % {\tt http://www-sop.inria.fr/lemme/pcoq/}
667 % \bibitem{HELM} A.Asperti, L.Padovani, C.Sacerdoti Coen, I.Schena.
668 % Towards a library of formal mathematics. Panel session of
669 % the 13th International Conference on Theorem Proving in Higher Order Logics (TPHOLS'2000),
670 % Portland, Oregon, USA.
672 % \bibitem{Ring} S. Boutin. Using reflection to build efficient and certified
673 % decision procedures. In Martin Abadi and Takahashi Ito, editors, TACS'97,
674 % volume 1281. LNCS, Springer-Verlag, 1997.
676 % \bibitem{YANNTHESIS} Y.Coscoy. \emph{Explication textuelle de preuves pour le
677 % Calcul des Constructions Inductives}, PhD. Thesis, Universit\'e de Nice-Sophia
680 % \bibitem{ALFA} T. Hallgreen, Aarne Ranta. An Extensible Proof Text Editor.
681 % Presented at LPAR2000.
683 % \bibitem{Necula} G. Necula, P. Lee. Safe Kernel Extensions Without Run-Time
684 % Checking. Presented at OSDI'96, October 1996.
686 % \bibitem{Necula2} G. Necula, P. Lee. Efficient Representation and Validation of Proofs. Presented at LICS'98, June 1998
688 % \bibitem{Werner} B. Werner. \emph{Une Th\'eorie des Constructions Inductives},
689 % PhD. Thesis, Universit\'e Paris VII, May 1994.
691 \end{thebibliography}