X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=blobdiff_plain;f=helm%2Fpapers%2Fmatita%2Fmatita2.tex;fp=helm%2Fpapers%2Fmatita%2Fmatita2.tex;h=0000000000000000000000000000000000000000;hb=85747dc6d0578b484544bb8120aad7aa89813f27;hp=a0c014251979413ea0b1edaba8e0cd26f52b47d0;hpb=c1986639552e01334a05db4236627a6c1ffacf21;p=helm.git diff --git a/helm/papers/matita/matita2.tex b/helm/papers/matita/matita2.tex deleted file mode 100644 index a0c014251..000000000 --- a/helm/papers/matita/matita2.tex +++ /dev/null @@ -1,1807 +0,0 @@ -\documentclass[]{kluwer} -\usepackage{color} -\usepackage{graphicx} -\usepackage{hyperref} -\usepackage{color} -\usepackage{fancyvrb} -\usepackage[show]{ed} - -\newcommand{\component}{component} -\newcommand{\components}{components} - -\newcommand{\BOXML}{BoxML} -\newcommand{\COQ}{Coq} -\newcommand{\GDOME}{Gdome} -\newcommand{\GTK}{GTK+} -\newcommand{\GTKMATHVIEW}{\textsc{GtkMathView}} -\newcommand{\HELM}{Helm} -\newcommand{\HINT}{\textsc{Hint}} -% \newcommand{\IN}{\ensuremath{\dN}} -\newcommand{\LEGO}{Lego} -\newcommand{\MATHML}{MathML} -\newcommand{\MATITA}{Matita} -\newcommand{\MATITAC}{\texttt{matitac}} -\newcommand{\MATITADEP}{\texttt{matitadep}} -\newcommand{\MOWGLI}{MoWGLI} -\newcommand{\MOWGLIIST}{IST-2001-33562} -\newcommand{\NUPRL}{NuPRL} -\newcommand{\OCAML}{OCaml} -\newcommand{\REF}[3]{\ensuremath{\mathit{Ref}_{#1}(#2,#3)}} -\newcommand{\REWRITEHINT}{\textsc{RewriteHint}} -\newcommand{\TEXMACRO}[1]{\texttt{\char92 #1}} -\newcommand{\UWOBO}{UWOBO} -\newcommand{\GETTER}{Getter} -\newcommand{\WHELP}{Whelp} - -\newcommand{\DOT}{\ensuremath{\mbox{\textbf{.}}}} -\newcommand{\SEMICOLON}{\ensuremath{\mbox{\textbf{;}}}} -\newcommand{\BRANCH}{\ensuremath{\mbox{\textbf{[}}}} -\newcommand{\SHIFT}{\ensuremath{\mbox{\textbf{\textbar}}}} -\newcommand{\POS}[1]{\ensuremath{#1\mbox{\textbf{:}}}} -\newcommand{\MERGE}{\ensuremath{\mbox{\textbf{]}}}} -\newcommand{\FOCUS}[1]{\ensuremath{\mathtt{focus}~#1}} -\newcommand{\UNFOCUS}{\ensuremath{\mathtt{unfocus}}} -\newcommand{\SKIP}{\MATHTT{skip}} -\newcommand{\TACTIC}[1]{\ensuremath{\mathtt{tactic}~#1}} - -\newcommand{\NT}[1]{\ensuremath{\langle\mathit{#1}\rangle}} -\newcommand{\URI}[1]{\texttt{#1}} -\newcommand{\OP}[1]{``\texttt{#1}''} -\newcommand{\FILE}[1]{\texttt{#1}} -\newcommand{\TAC}[1]{\texttt{#1}} -\newcommand{\TODO}[1]{\textbf{TODO: #1}} - -\definecolor{gray}{gray}{0.85} % 1 -> white; 0 -> black - -\newenvironment{grafite}{\VerbatimEnvironment - \begin{SaveVerbatim}{boxtmp}}% - {\end{SaveVerbatim}\setlength{\fboxsep}{3mm}% - \begin{center} - \fcolorbox{black}{gray}{\BUseVerbatim[boxwidth=0.9\linewidth]{boxtmp}} - \end{center}} - -\newcounter{pass} -\newcommand{\PASS}{\stepcounter{pass}\arabic{pass}} - -\newsavebox{\tmpxyz} -\newcommand{\sequent}[2]{ - \savebox{\tmpxyz}[0.9\linewidth]{ - \begin{minipage}{0.9\linewidth} - \ensuremath{#1} \\ - \rule{3cm}{0.03cm}\\ - \ensuremath{#2} - \end{minipage}}\setlength{\fboxsep}{3mm}% - \begin{center} - \fcolorbox{black}{gray}{\usebox{\tmpxyz}} - \end{center}} - -\bibliographystyle{klunum} - -\begin{document} - -\begin{opening} - \title{The \MATITA{} Proof Assistant} - - \author{Andrea \surname{Asperti} \email{asperti@cs.unibo.it}} - \author{Claudio \surname{Sacerdoti Coen} \email{sacerdot@cs.unibo.it}} - \author{Enrico \surname{Tassi} \email{tassi@cs.unibo.it}} - \author{Stefano \surname{Zacchiroli} \email{zacchiro@cs.unibo.it}} - - \institute{Department of Computer Science, University of Bologna\\ - Mura Anteo Zamboni, 7 --- 40127 Bologna, ITALY} - - \runningtitle{The \MATITA{} proof assistant} - \runningauthor{Asperti, Sacerdoti Coen, Tassi, Zacchiroli} - -%\begin{motto} -% ``We are nearly bug-free'' -- \emph{CSC, Oct 2005} -% \end{motto} - - \begin{abstract} - \MATITA{} is a new document-centric proof assistant that integrates several - Mathematical Knowledge Management tools and techniques. In this paper - we describe the architecture of \MATITA{} and the peculiarities of its user - interface. - \end{abstract} - - \keywords{Proof Assistant, Mathematical Knowledge Management, XML, Authoring, - Digital Libraries} -\end{opening} - -% toc & co: to be removed in the final paper version -\tableofcontents - -\section{Introduction} -\label{sec:intro} - -\MATITA is the Proof Assistant under development by the \HELM{} -team~\cite{mkm-helm} at the University of Bologna, under the direction of -Prof.~Asperti. \MATITA{} is free software, the source code is available for -download at \url{http://matita.cs.unibo.it}. This paper describes the overall -architecture of the system, focusing on its most distinctive and innovative -features. - -\subsection{Historical perspective} - -The origins of \MATITA{} go back to 1999. At the time we were mostly -interested in developing tools and techniques to enhance the accessibility -via Web of libraries of formalized mathematics. Due to its dimension, the -library of the \COQ~\cite{CoqManual} proof assistant (of the order of -35'000 theorems) -was chosen as a privileged test bench for our work, although experiments -have been also conducted with other systems, and notably -with \NUPRL~\cite{nuprl-book}. -The work, mostly performed in the framework of the recently concluded -European project \MOWGLIIST{} \MOWGLI~\cite{pechino}, mainly consisted in the -following steps: -\begin{enumerate} - - \item exporting the information from the internal representation of - \COQ{} to a system and platform independent format. Since XML was at - the time an emerging standard, we naturally adopted that technology, - fostering a content-centric architecture~\cite{content-centric} where - the documents of the library were the the main components around which - everything else has to be built; - - \item developing indexing and searching techniques supporting semantic - queries to the library; - - \item developing languages and tools for a high-quality notational - rendering of mathematical information.\footnote{We have been active in - the \MATHML{} Working group since 1999.} - -\end{enumerate} - -According to our content-centric commitment, the library exported from -\COQ{} was conceived as being distributed and most of the tools were developed -as Web services. The user can interact with the library and the tools by -means of a Web interface that orchestrates the Web services. - -Web services and other tools have been implemented as front-ends -to a set of software components, collectively called the \HELM{} components. -At the end of the \MOWGLI{} project we already disposed of the following -tools and software components: -\begin{itemize} - - \item XML specifications for the Calculus of Inductive Constructions, - with components for parsing and saving mathematical objects in such a - format~\cite{exportation-module}; - - \item metadata specifications with components for indexing and querying the - XML knowledge base; - - \item a proof checker (i.e. the \emph{kernel} of a proof assistant), - implemented to check that we exported from the \COQ{} library all the - logically relevant content; - - \item a sophisticated term parser (used by the search engine), able to deal - with potentially ambiguous and incomplete information, typical of the - mathematical notation~\cite{disambiguation}; - - \item a \emph{refiner} component, i.e. a type inference system, based on - partially specified terms, used by the disambiguating parser; - - \item complex transformation algorithms for proof rendering in natural - language~\cite{remathematization}; - - \item an innovative, \MATHML-compliant rendering widget~\cite{padovani} - for the \GTK{}\footnote{\url{http://www.gtk.org}} graphical environment, - supporting high-quality bidimensional - rendering, and semantic selection, i.e. the possibility to select semantically - meaningful rendering expressions, and to paste the respective content into - a different text area. - -\end{itemize} - -Starting from all this, developing our own proof assistant was not -too far away: essentially, we ``just'' had to -add an authoring interface, and a set of functionalities for the -overall management of the library, integrating everything into a -single system. \MATITA{} is the result of this effort. - -\subsection{The system} - -\MATITA{} is a proof assistant (also called interactive theorem prover). -It is based on the Calculus of (Co)Inductive Constructions -(CIC)~\cite{Werner} that is a dependently typed lambda-calculus \`a la -Church enriched with primitive inductive and co-inductive data types. -Via the Curry-Howard isomorphism, the calculus can be seen as a very -rich higher order logic and proofs can be simply represented and -stored as lambda-terms. \COQ{} and \LEGO~\cite{lego} are other systems -that adopt (variations of) CIC as their foundation. - -The proof language of \MATITA{} is procedural, in the tradition of the LCF -theorem prover~\cite{lcf}. \COQ, \NUPRL, PVS, Isabelle are all examples of -others systems -whose proof language is procedural. Traditionally, in a procedural system -the user interacts only with the \emph{script}, while proof terms are internal -records kept by the system. -In \MATITA{}, which has been conceived for a potentially distributed -library, proof terms are also meant as the primary representation for -storage and communication with other systems (e.g. \COQ). -%On the contrary, in \MATITA{} proof terms are -%praised as declarative versions of the proof. Playing that role, they are the -%primary mean of communication of proofs (once rendered to natural language -%for human audiences). - -All the user interfaces currently adopted by proof assistants based on a -procedural proof language have been influenced by the CtCoq pioneering -system~\cite{ctcoq1}. One successful incarnation of the ideas introduced -by CtCoq is the Proof General generic interface~\cite{proofgeneral}, -that has set a sort of -standard way to interact with the system. -%Several procedural proof assistants -%have either adopted or cloned Proof General as their main user interface. -The authoring interface of \MATITA{} essentially offers the same -functionalities of the Proof General interface or CoqIde. -On the contrary, the interface to interact with the library is rather -innovative and directly inspired by the Web interfaces to our Web servers. - -\MATITA{} is backward compatible with the XML library of proof objects exported -from \COQ{}, but, in order to test the actual usability of the system, we are -also developing a new library of basic results from scratch. - -\subsection{Relationship with \COQ{}} - -At first sight, \MATITA{} looks as (and partly is) a \COQ{} clone. This is -more the effect of the circumstances of its creation described -above than the result of a deliberate design. In particular, we -(essentially) share the same foundational dialect of \COQ{} (the -Calculus of (Co)Inductive Constructions), the same implementation -language (\OCAML\footnote{\url{http://caml.inria.fr/}}), -and the same (procedural, script based) authoring philosophy. -However, the analogy essentially stops here; in particular, no code is shared -between the two systems and the algorithms are often different. - -In a sense, we like to think of \MATITA{} as the way \COQ{} would -look like if entirely rewritten from scratch: just to give an -idea, although \MATITA{} currently supports almost all functionalities of -\COQ{}, it links 60'000 lines of \OCAML{} code, against the 166'000 lines linked -by \COQ{} (and we are convinced that, starting from scratch again, -we could reduce our code even further in a sensible way). - -Moreover, the complexity of the code of \MATITA{} is greatly reduced with -respect to \COQ. For instance, the API of the components of \MATITA{} comprise -989 functions, to be compared with the 4'286 functions of \COQ. - -Finally, \MATITA{} has several innovative features over \COQ{} that derive -from the integration of Mathematical Knowledge Management tools with proof -assistants. Among them, the advanced indexing tools over the library and -the parser for ambiguous mathematical notation. - -The size and complexity improvements over \COQ{} must be understood -historically. \COQ~\cite{CoqArt} is a quite old -system whose development started 20 years ago. Since then, -several developers have took over the code and several new research ideas -and optimizations that were not considered in the original architecture -have been experimented -and integrated in the system. Moreover, there exists a lot of developments -for \COQ{} that require backward compatibility between each pair of releases; -since many central functionalities of a proof assistant are based on heuristics -or arbitrary choices to overcome undecidability (e.g. for higher order -unification), changing these functionalities maintaining backward compatibility -is very difficult. -%Finally, the code of \COQ{} has been greatly optimized -%over the years; optimization reduces maintainability and rises the complexity -%of the code. - -In writing \MATITA{} we have not been hindered by backward compatibility and -we have took advantage of the research results and experiences previously -developed by others, comprising the authors of \COQ. Moreover, starting from -scratch, we have designed in advance the architecture and we have split -the code in coherent minimally coupled components. - -In the future we plan to exploit \MATITA{} as a test bench for new ideas and -extensions. Keeping the single components and the whole architecture as -simple as possible is thus crucial to foster future experiments and to -allow other developers to quickly understand our code and contribute. - -%For direct experience of the authors, the learning curve to understand and -%be able to contribute to \COQ{}'s code is quite steep and requires direct -%and frequent interactions with \COQ{} developers. - -\section{Architecture} -\label{architettura} - -\begin{figure}[!ht] - \begin{center} - \includegraphics[width=0.9\textwidth,height=0.8\textheight]{pics/libraries-clusters} - \caption[\MATITA{} components and related applications]{\MATITA{} - components and related applications, with thousands of line of - codes (klocs)\strut} - \label{fig:libraries} - \end{center} -\end{figure} - -Fig.~\ref{fig:libraries} shows the architecture of the \emph{\components} -(circle nodes) and \emph{applications} (squared nodes) developed in the -\HELM{} project. Each node is annotated with the number of lines of -source code (comprising comments). - -Applications and \components{} depend on other \components{} forming a -directed acyclic graph (DAG). Each \component{} can be decomposed in -a set of \emph{modules} also forming a DAG. - -Modules and \components{} provide coherent sets of functionalities -at different scales. Applications that require only a few functionalities -depend on a restricted set of \components. - -Only the proof assistant \MATITA{} and the \WHELP{} search engine are -applications meant to be used directly by the user; all the other applications -are Web services developed in the \HELM{} and \MOWGLI{} projects. The -following applications have already been described elsewhere: -\begin{itemize} - - \item The \emph{\GETTER}~\cite{zack-master} is a Web service to - retrieve an (XML) document from a physical location (URL) given its - logical name (URI). The Getter is responsible of updating a table that - maps URIs to URLs. Thanks to the Getter it is possible to work on a - logically monolithic library that is physically distributed on the - network. - - \item \emph{\WHELP}~\cite{whelp} is a search engine to index and - locate mathematical concepts (axioms, theorems, definitions) in the - logical library managed by the Getter. Typical examples of - \WHELP{} queries are those that search for a theorem that generalize or - instantiate a given formula, or that can be immediately applied to - prove a given goal. The output of Whelp is an XML document that lists - the URIs of a complete set of candidates that are likely to satisfy - the given query. The set is complete in the sense that no concept that - actually satisfies the query is thrown away. However, the query is - only approximated in the sense that false matches can be returned. - - \item \emph{\UWOBO}~\cite{zack-master} is a Web service that, given the - URI of a mathematical concept in the distributed library, renders it - according to the user provided two dimensional mathematical notation. - \UWOBO{} may also inline the rendering of mathematical concepts into - arbitrary documents before returning them. The Getter is used by - \UWOBO{} to retrieve the document to be rendered. - - \item The \emph{Proof Checker}~\cite{zack-master} is a Web service - that, given the URI of a concept in the distributed library, checks its - correctness. Since the concept is likely to depend in an acyclic way - on other concepts, the proof checker is also responsible of building - in a top-down way the DAG of all dependencies, checking in turn every - concept for correctness. - - \item The \emph{Dependency Analyzer}~\cite{zack-master} is a Web - service that can produce a textual or graphical representation of the - dependencies of a concept. - -\end{itemize} - -The dependency of a \component{} or application over another \component{} can -be satisfied by linking the \component{} in the same executable. -For those \components{} whose functionalities are also provided by the -aforementioned Web services, it is also possible to link stub code that -forwards the request to a remote Web service. For instance, the -\GETTER{} application is just a wrapper to the \GETTER{} \component{} -that allows it to be used as a Web service. \MATITA{} can directly link -the code of the \GETTER{} \component, or it can use a stub library with -the same API that forwards every request to the Web service. - -To better understand the architecture of \MATITA{} and the role of each -\component, we can focus on the representation of the mathematical -information. In CIC terms are used to represent mathematical formulae, -types and proofs. \MATITA{} is able to handle terms at four different -levels of specification. On each level it is possible to provide a -different set of functionalities. The four different levels are: fully -specified terms; partially specified terms; content level terms; -presentation level terms. - -\subsection{Fully specified terms} -\label{sec:fullyintro} - - \emph{Fully specified terms} are CIC terms where no information is - missing or left implicit. A fully specified term should be well-typed. - The mathematical concepts (axioms, definitions, theorems) that are stored - in our mathematical library are fully specified and well-typed terms. - Fully specified terms are extremely verbose (to make type-checking - decidable). Their syntax is fixed and does not resemble the usual - extendible mathematical notation. They are not meant for direct user - consumption. - - The \texttt{cic} \component{} defines the data type that represents CIC terms - and provides a parser for terms stored in XML format. - - The most important \component{} that deals with fully specified terms is - \texttt{cic\_proof\_checking}. It implements the procedure that verifies - if a fully specified term is well-typed. It also implements the - \emph{conversion} judgement that verifies if two given terms are - computationally equivalent (i.e. they share the same normal form). - - Terms may reference other mathematical concepts in the library. - One commitment of our project is that the library should be physically - distributed. The \GETTER{} \component{} manages the distribution, - providing a mapping from logical names (URIs) to the physical location - of a concept (an URL). The \texttt{urimanager} \component{} provides the URI - data type and several utility functions over URIs. The - \texttt{cic\_proof\_checking} \component{} calls the \GETTER{} - \component{} every time it needs to retrieve the definition of a mathematical - concept referenced by a term that is being type-checked. - - The Proof Checker application is the Web service that provides an interface - to the \texttt{cic\_proof\_checking} \component. - - We use metadata to index the mathematical concepts - in the distributed library. We are interested in retrieving a concept - by matching, instantiation or generalization of a user or system provided - mathematical formula. Thus we need to collect metadata over the fully - specified terms and to store the metadata in some kind of (relational) - database for later usage. The \texttt{hmysql} \component{} provides - a simplified - interface to a (possibly remote) MySQL\footnote{\url{http://www.mysql.com/}} - database system used to store the metadata. - The \texttt{metadata} \component{} defines the data type of the metadata - we are collecting and the functions that extracts the metadata from the - mathematical concepts (the main functionality of the crawler). - The \texttt{whelp} \component{} implements a search engine that performs - approximated queries by matching/instantiation/generalization. The queries - operate only on the metadata and do not involve any actual matching - (see the \texttt{cic\_unification} \component{} in - Sect.~\ref{sec:partiallyintro}). Not performing any actual matching - a query only returns a complete and hopefully small set of matching - candidates. The process that has issued the query is responsible of - actually retrieving from the distributed library the candidates to prune - out false matches if interested in doing so. - - The \WHELP{} application is the Web service that provides an interface to - the \texttt{whelp} \component. - - According to our vision, the library is developed collaboratively so that - changing or removing a concept can invalidate other concepts in the library. - Moreover, changing or removing a concept requires a corresponding change - in the metadata database. The \texttt{library} \component{} is responsible - of preserving the coherence of the library and the database. For instance, - when a concept is removed, all the concepts that depend on it and their - metadata are removed from the library. This aspect will be better detailed - in Sect.~\ref{sec:libmanagement}. - -\subsection{Partially specified terms} -\label{sec:partiallyintro} - -\emph{Partially specified terms} are CIC terms where subterms can be omitted. -Omitted subterms come in two flavours: -\emph{implicit terms} that do not require a declaration, but can only occur -linearly; and \emph{metavariables}~\cite{geuvers-jojgov,munoz} whose declaration -associates with them a sequent and whose occurrences are coupled with an -explicit substitution. -A metavariable stands for a term whose type is -given by the conclusion of the sequent. The term must be closed in the -context that is given by the ordered list of hypotheses of the sequent. -The explicit substitution instantiates every hypothesis with an actual -value for the variable bound by the hypothesis. - -Partially specified terms are not required to be well-typed. However a -partially specified term should be \emph{refinable}. A \emph{refiner} is -a type-inference procedure that can instantiate implicit terms and -metavariables and that can introduce -\emph{implicit coercions}~\cite{barthe95implicit} to make a -partially specified term well-typed. The refiner of \MATITA{} is implemented -in the \texttt{cic\_unification} \component. In the same way as the -type checker is based on -convertibility, the refiner is based on \emph{unification}~\cite{strecker} -that is a procedure that makes two partially specified term convertible by -instantiating as few metavariables as possible that occur in them. - -Since terms are used in CIC to represent proofs, correct incomplete -proofs are represented by refinable partially specified terms. The metavariables -that occur in the proof correspond to the conjectures still to be proved. -The sequent associated to the metavariable is the conjecture the user needs to -prove. - -\emph{Tactics} are the procedures that the user can apply to progress in the -proof. A tactic proves a conjecture possibly creating new (and hopefully -simpler) conjectures. The implementation of tactics is given in the -\texttt{tactics} \component. It is heavily based on the refinement and -unification procedures of the \texttt{cic\_unification} \component. - -The \texttt{grafite} \component{} defines the abstract syntax tree (AST) for the -commands of the \MATITA{} proof assistant. Most of the commands are tactics. -Other commands are used to give definitions and axioms or to state theorems -and lemmas. The \texttt{grafite\_engine} \component{} is the core of \MATITA. -It implements the semantics of each command in the grafite AST as a function -from status to status. It implements also an undo function to go back to -previous statuses. - -As fully specified terms, partially specified terms are not well suited -for user consumption since their syntax is not extendible and it is not -possible to adopt the usual mathematical notation. However they are already -an improvement over fully specified terms since they allow to omit redundant -information that can be inferred by the refiner. - -\subsection{Content level terms} -\label{sec:contentintro} - -The language used to communicate proofs and especially formulae with the -user does not only need to be extendible and accommodate the usual mathematical -notation. It must also reflect the comfortable and suggestive degree of -notational abuse so typical of the mathematical language. - -For instance, it is common practice in mathematics to speak of a generic -equality that can be used to compare any two terms. However, it is well known -that several equalities can be distinguished as soon as we care for decidability -or for their computational properties. For instance equality over real -numbers is well known to be undecidable, whereas it is decidable over -rational numbers. - -Similarly, we usually speak of natural numbers and their operations and -properties without caring about their representation. However the computational -properties of addition over the binary representation are very different from -those of addition over the unary representation. And addition over two natural -numbers is definitely different from addition over two real numbers. - -Formalized mathematics cannot hide these differences and forces the user to be -very precise on the types he is using and their representation. However, -to communicate formulae with the user and with external tools, it seems good -practice to stick to the usual imprecise mathematical ontology. In the -Mathematical Knowledge Management community this imprecise language is called -the \emph{content level}~\cite{adams} representation of formulae. - -In \MATITA{} we provide translations from partially specified terms -to content level terms and the other way around. -The former translation must -discriminate between terms used to represent proofs and terms used to represent -formulae. Using techniques inspired by~\cite{natural,YANNTHESIS}, the firsts -are translated to a content level representation of -proof steps that can in turn easily be rendered in natural language. -The representation -adopted has greatly influenced the OMDoc~\cite{omdoc} proof format that is now -isomorphic to it. Terms that represent formulae are translated to \MATHML{} -Content formulae. \MATHML{} Content~\cite{mathml} is a W3C standard -for the representation of content level formulae in an extensible XML format. - -The translation to content level is implemented in the -\texttt{acic\_content} \component. Its input are \emph{annotated partially -specified terms}, that are -partially specified terms enriched with additional typing information for each -subterm. This information is used to discriminate between terms that represent -proofs and terms that represent formulae. Part of it is also stored at the -content level since it is required to generate the natural language rendering -of proofs. -The \texttt{cic\_acic} \component{} annotates terms. It is used -by the \texttt{library} \component{} since fully specified terms are stored -in the library in their annotated form. - -We do not provide yet a reverse translation from content level proofs to -partially specified terms. But in \texttt{cic\_disambiguation} we do provide -the reverse translation for formulae. The mapping from -content level formulae to partially specified terms is not unique due to -the ambiguity of the content level. As a consequence the translation -is guided by an \emph{interpretation}, that is a function that chooses for -every ambiguous formula one partially specified term. The -\texttt{cic\_disambiguation} \component{} implements the -disambiguation algorithm presented in~\cite{disambiguation} that is -responsible of building in an efficient way the set of all correct -interpretations. An interpretation is correct if the partially specified term -obtained using the interpretation is refinable. - -In Sect.~\ref{sec:partiallyintro} we described the semantics of -a command as a -function from status to status. We also hinted that the formulae in a -command are encoded as partially specified terms. However, consider the -command ``\texttt{replace} $x$ \texttt{with} $y^2$''. Until the occurrence -of $x$ to be replaced is located, its context is unknown. Since $y^2$ must -replace $x$ in that context, its encoding as a term cannot be computed -until $x$ is located. In other words, $y^2$ must be disambiguated in the -context of the occurrence $x$ it must replace. - -The elegant solution we have implemented consists in representing terms -in a command as functions from a context to a partially refined term. The -function is obtained by partially applying our disambiguation function to -the content level term to be disambiguated. Our solution should be compared with -the one adopted in the \COQ{} system (where ambiguity is only relative to -De Bruijn indexes). -In \COQ, variables can be bound either by name or by position. A term -occurring in a command has all its variables bound by name to avoid the need of -a context during disambiguation. This makes more complex every -operation over terms (i.e. according to our architecture every module that -depends on \texttt{cic}) since the code must deal consistently with both kinds -of binding. Moreover, this solution cannot cope with other forms of ambiguity -(as the context dependent meaning of the exponent in the previous example). - -\subsection{Presentation level terms} -\label{sec:presentationintro} - -Content level terms are a sort of abstract syntax trees for mathematical -formulae and proofs. The concrete syntax given to these abstract trees -is called \emph{presentation level}. - -The main important difference between the content level language and the -presentation level language is that only the former is extendible. Indeed, -the presentation level language is a finite language that comprises all -the usual mathematical symbols. Mathematicians invent new notions every -single day, but they stick to a set of symbols that is more or less fixed. - -The fact that the presentation language is finite allows the definition of -standard languages. In particular, for formulae we have adopt \MATHML{} -Presentation~\cite{mathml} that is an XML dialect standardized by the W3C. To -visually -represent proofs it is enough to embed formulae in plain text enriched with -formatting boxes. Since the language of formatting boxes is very simple, -many equivalent specifications exist and we have adopted our own, called -\BOXML. - -The \texttt{content\_pres} \component{} contains the implementation of the -translation from content level terms to presentation level terms. The -rendering of presentation level terms is left to the application that uses -the \component. However, in the \texttt{hgdome} \component{} we provide a few -utility functions to build a \GDOME~\cite{gdome2} \MATHML+\BOXML{} tree from our -presentation -level terms. \GDOME{} \MATHML+\BOXML{} trees can be rendered by the -\GTKMATHVIEW{} -widget developed by Luca Padovani~\cite{padovani}. The widget is -particularly interesting since it allows the implementation of \emph{semantic -selection}. - -Semantic selection is a technique that consists in enriching the presentation -level terms with pointers to the content level terms and to the partially -specified terms they correspond to. Highlight of formulae in the widget is -constrained to selection of meaningful expressions, i.e. expressions that -correspond to a lower level term, that is a content term or a partially or -fully specified term. -Once the rendering of a lower level term is -selected it is possible for the application to retrieve the pointer to the -lower level term. An example of applications of semantic selection is -\emph{semantic copy \& paste}: the user can select an expression and paste it -elsewhere preserving its semantics (i.e. the partially specified term), -possibly performing some semantic transformation over it (e.g. renaming -variables that would be captured or lambda-lifting free variables). - -The reverse translation from presentation level terms to content level terms -is implemented by a parser that is also found in \texttt{content\_pres}. -Differently from the translation from content level terms to partially -refined terms, this translation is not ambiguous. The reason is that the -parsing tool we have adopted (CamlP4) is not able to parse ambiguous -grammars. Thus we require the mapping from presentation level terms -(concrete syntax) to content level terms (abstract syntax) to be unique. -This means that the user must fix once and for all the associativity and -precedence level of every operator he is using. In practice this limitation -does not seem too strong. The reason is that the target of the -translation is an ambiguous language and the user is free to associate -to every content level term several different interpretations (as a -partially specified term). - -Both the direct and reverse translation from presentation to content level -terms are parameterized over the user provided mathematical notation. -The \texttt{lexicon} \component{} is responsible of managing the lexicon, -that is the set of active notations. It defines an abstract syntax tree -of commands to declare and activate new notations and it implements the -semantics of these commands. It also implements undoing of the semantic -actions. Among the commands there are hints to the -disambiguation algorithm that are used to control and speed up disambiguation. -These mechanisms will be further discussed in Sect.~\ref{sec:disambiguation}. - -Finally, the \texttt{grafite\_parser} \component{} implements a parser for -the concrete syntax of the commands of \MATITA. The parser processes a stream -of characters and returns a stream of abstract syntax trees (the ones -defined by the \texttt{grafite} component and whose semantics is given -by \texttt{grafite\_engine}). When the parser meets a command that changes -the lexicon, it invokes the \texttt{lexicon} \component{} to immediately -process the command. When the parser needs to parse a term at the presentation -level, it invokes the already described parser for terms contained in -\texttt{content\_pres}. - -The \MATITA{} proof assistant and the \WHELP{} search engine are both linked -against the \texttt{grafite\_parser} \components{} -since they provide an interface to the user. In both cases the formulae -written by the user are parsed using the \texttt{content\_pres} \component{} and -then disambiguated using the \texttt{cic\_disambiguation} \component. However, -only \MATITA{} is linked against the \texttt{grafite\_engine} and -\texttt{tactics} components (summing up to a total of 11'200 lines of code) -since \WHELP{} can only execute those ASTs that correspond to queries -(implemented in the \texttt{whelp} component). - -The \UWOBO{} Web service wraps the \texttt{content\_pres} \component, -providing a rendering service for the documents in the distributed library. -To render a document given its URI, \UWOBO{} retrieves it using the -\GETTER{} obtaining a document with fully specified terms. Then it translates -it to the presentation level passing through the content level. Finally -it returns the result document to be rendered by the user's -browser. - -The \components{} not yet described (\texttt{extlib}, \texttt{xml}, -\texttt{logger}, \texttt{registry} and \texttt{utf8\_macros}) are -minor \components{} that provide a core of useful functions and basic -services missing from the standard library of the programming language. -%In particular, the \texttt{xml} \component{} is used to easily represent, -%parse and pretty-print XML files. - -\section{The interface to the library} -\label{sec:library} - -A proof assistant provides both an interface to interact with its library and -an authoring interface to develop new proofs and theories. According -to its historical origins, \MATITA{} strives to provide innovative -functionalities for the interaction with the library. It is more traditional -in its script based authoring interface. In the remaining part of the paper we -focus on the user view of \MATITA. - -The library of \MATITA{} comprises mathematical concepts (theorems, -axioms, definitions) and notation. The concepts are authored sequentially -using scripts that are (ordered) sequences of procedural commands. -Once they are produced we store them independently in the library. -The only relation implicitly kept between the concepts are the logical, -acyclic dependencies among them. This way the library forms a global (and -distributed) hypertext. - -\begin{figure}[!ht] - \begin{center} - \includegraphics[width=0.45\textwidth]{pics/cicbrowser-screenshot-browsing} - \hspace{0.05\textwidth} - \includegraphics[width=0.45\textwidth]{pics/cicbrowser-screenshot-query} - \caption{Browsing and searching the library\strut} - \label{fig:cicbrowser1} - \end{center} -\end{figure} - -\begin{figure}[!ht] - \begin{center} - \includegraphics[width=0.70\textwidth]{pics/cicbrowser-screenshot-con} - \caption[Natural language rendering]{Natural language rendering of a theorem - from the library\strut} - \label{fig:cicbrowser2} - \end{center} -\end{figure} - -Several useful operations can be implemented on the library only, -regardless of the scripts. For instance, searching and browsing is -implemented by the ``cicBrowser'' window available from the \MATITA{} -GUI. Using it, the hierarchical structure of the library can be -explored (on the left of Fig.~\ref{fig:cicbrowser1}), the natural -language rendering of proofs can be inspected -(Fig.~\ref{fig:cicbrowser2}), and content based searches on the -library can be performed (on the right of Fig.~\ref{fig:cicbrowser1}). -Content based searches are described in -Sect.~\ref{sec:indexing}. Other examples of library operations are -disambiguation of content level terms (see -Sect.~\ref{sec:disambiguation}) and automatic proof searching (see -Sect.~\ref{sec:automation}). - -The key requisite for the previous operations is that the library must -be fully accessible and in a logically consistent state. To preserve -consistency, a concept cannot be altered or removed unless the part of the -library that depends on it is modified accordingly. To allow incremental -changes and cooperative development, consistent revisions are necessary. -For instance, to modify a definition, the user could fork a new version -of the library where the definition is updated and all the concepts that -used to rely on it are absent. The user is then responsible to restore -the removed part in the new branch, merging the branch when the library is -fully restored. - -To implement the proposed versioning system on top of a standard one -it is necessary to implement \emph{invalidation} first. Invalidation -is the operation that locates and removes from the library all the concepts -that depend on a given one. As described in Sect.~\ref{sec:libmanagement} removing -a concept from the library also involves deleting its metadata from the -database. - -For non collaborative development, full versioning can be avoided, but -invalidation is still required. Since nobody else is relying on the -user development, the user is free to change and invalidate part of the library -without branching. Invalidation is still necessary to avoid using a -concept that is no longer valid. -So far, in \MATITA{} we address only this non collaborative scenario -(see Sect.~\ref{sec:libmanagement}). Collaborative development and versioning -is still under design. - -Scripts are not seen as constituents of the library. They are not published -and indexed, so they cannot be searched or browsed using \HELM{} tools. -However, they play a central role for the maintenance of the library. -Indeed, once a concept is invalidated, the only way to restore it is to -fix the possibly broken script that used to generate it. -Moreover, during the authoring phase, scripts are a natural way to -group concepts together. They also constitute a less fine grained clustering -of concepts for invalidation. - -In the rest of this section we present in more details the functionalities of -\MATITA{} related to library management and exploitation. -Sect.~\ref{sec:authoring} is devoted to the description of the peculiarities of -the \MATITA{} authoring interface. - -\subsection{Indexing and searching} -\label{sec:indexing} - -The \MATITA{} system is first of all an interface between the user and -the mathematical library. For this reason, it is important to be -able to search and retrieve mathematical concepts in a quick and -effective way, assuming as little knowledge as possible about the -library. To this aim, \MATITA{} uses a sophisticated indexing mechanism -for mathematical concepts, based on a rich metadata set that has been -tuned along the European project \MOWGLIIST{} \MOWGLI. The metadata -set, and the searching facilities built on top of them --- collected -in the so called \WHELP{} search engine --- have been -extensively described in~\cite{whelp}. Let us just recall here that -the \WHELP{} metadata model is essentially based on a single ternary relation -\REF{p}{s}{t} stating that a concept $s$ refers a concept $t$ at a -given position $p$, where the position specify the place of the -occurrence of $t$ inside $s$ (we currently work with a fixed set of -positions, discriminating the hypothesis from the conclusion and -outermost form innermost occurrences). This approach is extremely -flexible, since extending the set of positions -we may improve the granularity and the precision of our indexing technique, -with no additional architectural impact. - -Every time a new mathematical concept is created and saved by the user it gets -indexed, and becomes immediately visible in the library. Several -interesting and innovative features of \MATITA{} described in the following -sections rely in a direct or indirect way on its metadata system and -the search functionalities. Here, we shall just recall some of its most -direct applications. - -A first, very simple but not negligible feature is the \emph{duplicate check}. -As soon as a theorem is stated, just before starting its proof, -the library is searched -to check that no other equivalent statement has been already proved -(based on the pattern matching functionality of \WHELP); if this is the case, -a warning is raised to the user. At present, the notion of equivalence -adopted by \MATITA{} is convertibility, but we may imagine to weaken it -in the future, covering for instance isomorphisms~\cite{delahaye}. - -Another useful \WHELP{} operation is \HINT; we may invoke this query -at any moment during the authoring of a proof, resulting in the list -of all theorems of the library which can be applied to the current -goal. In practice, this is mostly used not really to discover what theorems -can be applied to a given goal, but to actually retrieve a theorem that -we wish to apply, but whose name we have forgotten. -In fact, even if \MATITA{} adopts a semi-rigid naming convention for -statements (see Sect.~\ref{sec:naming}) that greatly simplifies the effort -of recalling names, the naming discipline remains one of the most -annoying aspects of formal developments, and \HINT{} provides -a very friendly solution. - -In the near future, we expect to extend the \HINT{} query to -a \REWRITEHINT, resulting in all equational statements that -can be applied to rewrite the current goal. - -\subsection{Disambiguation} -\label{sec:disambiguation} - -Software applications that involve input of mathematical content should strive -to require the user as less drift from informal mathematics as possible. We -believe this to be a fundamental aspect of such applications user interfaces. -The drift is still very large for -proofs~\cite{debrujinfactor}, while better results may be achieved for -mathematical formulae. In \MATITA{} formulae can be written using a -\TeX-like syntax (corresponding to presentation level terms) and are then -translated (in multiple steps) to partially specified terms as sketched in -Sect.~\ref{sec:contentintro}. - -The key ingredient of the translation is the generic disambiguation algorithm -implemented in the \texttt{disambiguation} component of Fig.~\ref{fig:libraries} -and presented in~\cite{disambiguation}. In this section we detail how to use -that algorithm in the context of the development of a library of formalized -mathematics. We will see that using multiple passes of the algorithm, varying -some of its parameters, helps in keeping the input terse without sacrificing -expressiveness. - -\subsubsection{Disambiguation preferences} -\label{sec:disambaliases} - -Consider the following command that states a theorem over integer numbers: - -\begin{grafite} -theorem Zlt_compat: - \forall x,y,z. x < y \to y < z \to x < z. -\end{grafite} - -The symbol \OP{<} is likely to be overloaded in the library -(at least over natural numbers). -Thus, according to the disambiguation algorithm, two different -refinable partially specified terms could be associated to it. -\MATITA{} asks the user what interpretation he meant. However, to avoid -posing the same question in case of a future re-execution (e.g. undo/redo), -the choice must be recorded. Since scripts need to be re-executed after -invalidation, the choice record must be permanently stored somewhere. The most -natural place is the script itself. - -In \MATITA{} disambiguation is governed by \emph{disambiguation aliases}. -They are mappings, stored in the library, from ambiguity sources -(identifiers, symbols and literal numbers at the content level) to partially -specified terms. In case of overloaded sources there exists multiple aliases -with the same source. It is possible to record \emph{disambiguation -preferences} to select one of the aliases of an overloaded source. - -Preferences can be explicitly given in the script (using the \texttt{alias} -command), but -are also implicitly added when a new concept is introduced (\emph{implicit -preferences}) or after a successful disambiguation that did not require -user interaction. Explicit preferences are added automatically by \MATITA{} to -record the disambiguation choices of the user. For instance, after the -disambiguation of the command above, the script is altered as follows: - -\begin{grafite} -alias symbol "lt" = "integer 'less than'". -theorem Zlt_compat: - \forall x,y,z. x < y \to y < z \to x < z. -\end{grafite} - -The ``alias'' command in the example sets the preferred alias for the -\OP{lt} symbol. - -Implicit preferences for new concepts are set since a concept just defined is -likely to be the preferred one in the rest of the script. Implicit preferences -learned from disambiguation of previous commands grant the coherence of -the disambiguation in the rest of the script and speed up disambiguation -reducing the search space. - -Disambiguation preferences are included in the lexicon status -(see Sect.~\ref{sec:presentationintro}) that is part of the authoring interface -status. Unlike aliases, they are not part of the library. - -When starting a new authoring session the set of disambiguation preferences -is empty. Until it contains a preference for each overloaded symbol to be -used in the script, the user can be faced with questions from the disambiguator. -To reduce the likelihood of user interactions, we introduced -the \texttt{include} command. With \texttt{include} it is possible to import -at once in the current session the set of preferences that was in effect -at the end of the execution of a given script. -The inclusion mechanism is thus sensibly different from that of other systems -where concepts are effectively loaded and made visibible by inclusion; in \MATITA{} -all concepts are always visible, and inclusion, that is optional, is only used -to set up preferences. - -Preferences can be changed. For instance, at the beginning of the development -of integer numbers the preference for the symbol \OP{<} is likely -to be the one over natural numbers; sooner or later it will be set to the one -over integer numbers. - -Nothing forbids the set of preferences to become incoherent. For this reason -the disambiguator cannot always respect the user preferences. -Consider, for example: -\begin{grafite} -theorem Zlt_mono: - \forall x,y,k. x < y \to x < y + k. -\end{grafite} - -No refinable partially specified term corresponds to the preferences: -\OP{+} over natural numbers, \OP{<} over integer numbers. To overcome this -limitation we organized disambiguation in \emph{multiple passes}: when the -disambiguator fails, disambiguation is tried again with a less restrictive set of -preferences. - -Several disambiguation parameters can vary among passes. With respect to -preference handling we implemented three passes. In the first pass, called -\emph{mono-preferences}, we consider only the aliases corresponding to the -current set of preferences. In the second pass, called -\emph{multi-preferences}, we -consider every alias corresponding to a current or past preference. For -instance, in the example above disambiguation succeeds in the multi-preference -pass. In the third pass, called \emph{library-preferences}, all aliases -available in the library are considered. - -The rationale behind this choice is trying to respect user preferences in early -passes that complete quickly in case of failure; later passes are slower but -have more chances of success. - -\subsubsection{Operator instances} -\label{sec:disambinstances} - -Consider now the following theorem, where \texttt{pos} injects natural numbers -into positive integers: - -\begin{grafite} -theorem lt_to_Zlt_pos_pos: - \forall n,m: nat. n < m \to pos n < pos m. -\end{grafite} -and assume that there exist in the library aliases for \OP{<} over natural -numbers and over integer numbers. None of the passes described above is able to -disambiguate \texttt{lt\_to\_Zlt\_pos\_pos}, no matter how preferences are set. -This is because the \OP{<} operator occurs twice in the content level term (it -has two \emph{instances}) and two different interpretations for it have to be -used in order to obtain a refinable partially specified term. - -To address this issue, we have the ability to consider each instance of a single -symbol as a different ambiguous expression in the content level term, -enabling the use of a different alias for each of them. -Exploiting or not this feature is -one of the disambiguation pass parameters. A disambiguation pass which exploit -it is said to be using \emph{fresh instances} (opposed to a \emph{shared -instances} pass). - -Fresh instances lead to a non negligible performance loss (since the choice of -an alias for one instance does not constraint the choice of the others). For -this reason we always attempt a fresh instances pass only after attempting a -shared instances pass. - -%\paragraph{One-shot preferences} Disambiguation preferences as seen so far are -%instance-independent. However, implicit preferences obtained as a result of a -%disambiguation pass which uses fresh instances ought to be instance-dependent. -%Informally, the set of preferences that can be respected by the disambiguator on -%the theorem above is: ``the first instance of the \OP{<} symbol is over natural -%numbers, while the second is on integer numbers''. - -Instance-dependent preferences are meaningful only for the term whose -disambiguation generated them. For this reason we call them \emph{one-shot -preferences} and \MATITA{} does not use them to disambiguate further terms in -the script. - -\subsubsection{Implicit coercions} -\label{sec:disambcoercions} - -Consider the following theorem about derivation: -\begin{grafite} -theorem power_deriv: - \forall n: nat, x: R. d x^n dx = n * x^(n - 1). -\end{grafite} -and assume that in the library there is an alias mapping \OP{\^} to a partially -specified term having type: \texttt{R \TEXMACRO{to} nat \TEXMACRO{to} R}. In -order to disambiguate \texttt{power\_deriv}, the occurrence of \texttt{n} on the -right hand side of the equality need to be ``injected'' from \texttt{nat} to -\texttt{R}. The refiner of \MATITA{} supports -\emph{implicit coercions}~\cite{barthe95implicit} for -this reason: given as input the above presentation level term, it will return a -partially specified term where in place of \texttt{n} the application of a -coercion from \texttt{nat} to \texttt{R} appears (assuming such a coercion has -been defined in advance). - -Implicit coercions are not always desirable. For example, consider the term -\texttt{\TEXMACRO{forall} x. x < x + 1} and assume that the preferences for \OP{<} -and \OP{+} are over real numbers. The expected interpretation assignes the -type \texttt{R} to \texttt{x}. -However, if we had a coercion from natural to real numbers an alternative -interpretation is to assign the type \texttt{nat} to \texttt{x} inserting the coercion -as needed. Clearly, the latter interpretation looks artificial and -for this reason we enable coercions only in case of failure of previous -attempts. - -The choice of whether implicit coercions are enabled or not interacts with the -choice about operator instances. Indeed, consider again -\texttt{lt\_to\_Zlt\_pos\_pos}, which can be disambiguated using fresh operator -instances. In case there exists a coercion from natural numbers to positive -integers ({\texttt{pos} itself), the command -can be disambiguated as -\texttt{\TEXMACRO{forall} n,m: nat. pos n < pos m \TEXMACRO{to} pos n < pos m}. -This is not the expected interpretation; -by this and similar examples we choose to always prefer fresh -instances over implicit coercions. - -\subsubsection{Disambiguation passes} -\label{sec:disambpasses} - -According to the criteria described above, in \MATITA{} we perform the -disambiguation passes depicted in Tab.~\ref{tab:disambpasses}. In -our experience that choice gives reasonable performance and reduces the need -of user interaction during the disambiguation. - -\begin{table}[ht] - \begin{center} - \begin{tabular}{c|c|c|c} - \multicolumn{1}{p{1.5cm}|}{\centering\raisebox{-1.5ex}{\textbf{Pass}}} - & \multicolumn{1}{p{3.1cm}|}{\centering\raisebox{-1.5ex}{\textbf{Preferences}}} - & \multicolumn{1}{p{2.5cm}|}{\centering\textbf{Operator instances}} - & \multicolumn{1}{p{2.5cm}}{\centering\textbf{Implicit coercions}} \\ - \hline - \PASS & Mono-preferences & Shared instances & Disabled \\ - \PASS & Multi-preferences & Shared instances & Disabled \\ - \PASS & Mono-preferences & Fresh instances & Disabled \\ - \PASS & Multi-preferences & Fresh instances & Disabled \\ - \PASS & Mono-preferences & Fresh instances & Enabled \\ - \PASS & Multi-preferences & Fresh instances & Enabled \\ - \PASS & Library-preferences & Fresh instances & Enabled - \end{tabular} - \end{center} - \caption{Disambiguation passes sequence\strut} - \label{tab:disambpasses} -\end{table} - -\subsection{Invalidation and regeneration} -\label{sec:libmanagement} - -%The aim of this section is to describe the way \MATITA{} -%preserves the consistency and the availability of the library -%using the \WHELP{} technology, in response to the user alteration or -%removal of mathematical objects. -% -%As already sketched in Sect.~\ref{sec:fullyintro} what we generate -%from a script is split among two storage media, a -%classical filesystem and a relational database. The former is used to -%store the XML encoding of the objects defined in the script, the -%disambiguation aliases and the interpretation and notational convention defined, -%while the latter is used to store all the metadata needed by -%\WHELP. -% -%While the consistency of the data store in the two media has -%nothing to do with the nature of -%the content of the library and is thus uninteresting (but really -%tedious to implement and keep bug-free), there is a deeper -%notion of mathematical consistency we need to provide. Each object -%must reference only defined object (i.e. each proof must use only -%already proved theorems). - -In this section we will focus on how \MATITA{} ensures the library -consistency during the formalization of a mathematical theory, -giving the user the freedom of adding, removing, modifying objects -without loosing the feeling of an always visible and browsable -library. - -\subsubsection{Invalidation} - -Invalidation (see Sect.~\ref{sec:library}) is implemented in two phases. - -The first one is the computation of all the concepts that recursively -depend on the ones we are invalidating. It can be performed -from the metadata stored in the relational database. -This technique is the same used by the \emph{Dependency Analyzer} -and is described in~\cite{zack-master}. - -The second phase is the removal of all the results of the generation, -metadata included. - -\subsubsection{Regeneration} - -%The typechecker component guarantees that if an object is well typed -%it depends only on well typed objects available in the library, -%that is exactly what we need to be sure that the logic consistency of -%the library is preserved. - -To regenerate an invalidated part of the library \MATITA{} re-executes -the scripts that produced the invalidated concepts. The main -problem is to find a suitable order of execution of the scripts. - -For this purpose we provide a tool called \MATITADEP{} -that takes in input the list of scripts that compose the development and -outputs their dependencies in a format suitable for the GNU \texttt{make} -tool.\footnote{\url{http://www.gnu.org/software/make/}} -The user is not asked to run \MATITADEP{} by hand, but -simply to tell \MATITA{} the root directory of his development (where all -script files can be found) and \MATITA{} will handle all the generation -related tasks, including dependencies calculation. - -To compute dependencies it is enough to look at the script files for -literal of included explicit disambiguation preferences -(see Sect.~\ref{sec:disambaliases}). - -The re-execution of a script to regenerate part of the library -requires the preliminary invalidation of the concepts generated by the -script. - -\subsubsection{Batch vs Interactive} - -\MATITA{} includes an interactive authoring interface and a batch -``compiler'' (\MATITAC). - -Only the former is intended to be used directly by the -user, the latter is automatically invoked by \MATITA{} -to regenerate parts of the library previously invalidated. - -While they share the same engine for generation and invalidation, they -provide different granularity. \MATITAC{} is only able to re-execute a -whole script and similarly to invalidate all the concepts generated -by a script (together with all the other scripts that rely on a concept defined -in it). - -\subsection{Automation} -\label{sec:automation} - -In the long run, one would expect to work with a proof assistant -like \MATITA, using only three basic tactics: \TAC{intro}, \TAC{elim}, -and \TAC{auto} -(possibly integrated by a moderate use of \TAC{cut}). The state of the art -in automated deduction is still far away from this goal, but -this is one of the main development direction of \MATITA. - -Even in this field, the underlying philosophy of \MATITA{} is to -free the user from any burden relative to the overall management -of the library. For instance, in \COQ, the user is responsible to -define small collections of theorems to be used as a parameter -by the \TAC{auto} tactic; -in \MATITA, it is the system itself that automatically retrieves, from -the whole library, a subset of theorems worth to be considered -according to the signature of the current goal and context. - -The basic tactic merely iterates the use of the \TAC{apply} tactic -(with no \TAC{intro}). The search tree may be pruned according to two -main parameters: the \emph{depth} (whit the obvious meaning), and the -\emph{width} that is the maximum number of (new) open goals allowed at -any instant. \MATITA{} has only one notion of metavariable, corresponding -to the so called existential variables of \COQ; so, \MATITA's \TAC{auto} -tactic should be compared with \COQ's \TAC{EAuto} tactic. - -Recently we have extended automation with paramodulation based -techniques. At present, the system works reasonably well with -equational rewriting, where the notion of equality is parametric -and can be specified by the user: the system only requires -a proof of {\em reflexivity} and {\em paramodulation} (or rewriting, -as it is usually called in the proof assistant community). - -Given an equational goal, \MATITA{} recovers all known equational facts -from the library (and the local context), applying a variant of -the so called {\em given-clause algorithm}~\cite{paramodulation}, -that is the the procedure currently used by the majority of modern -automatic theorem provers. - -The given-clause algorithm is essentially composed by an alternation -of a \emph{saturation} phase and a \emph{demodulation} phase. -The former derives new facts by a set of active -facts and a new \emph{given} clause suitably selected from a set of passive -equations. The latter tries to simplify the equations -orienting them according to a suitable weight associated to terms. -\MATITA{} currently supports several different weighting functions -comprising Knuth-Bendix ordering (kbo) and recursive path ordering (rpo), -that integrates particularly well with normalization. - -Demodulation alone is already a quite powerful technique, and -it has been turned into a tactic by itself: the \TAC{demodulate} -tactic, which can be seen as a kind of generalization of \TAC{simplify}. -The following portion of script describes two -interesting cases of application of this tactic (both of them relying -on elementary arithmetic equations): - -\begin{grafite} -theorem example1: - \forall x: nat. (x+1)*(x-1) = x*x - 1. -intro. -apply (nat_case x); - [ simplify; reflexivity - | intro; demodulate; reflexivity ] -qed. -\end{grafite} - -\begin{grafite} -theorem example2: - \forall x,y: nat. (x+y)*(x+y) = x*x + 2*x*y + y*y. -intros; demodulate; reflexivity -qed. -\end{grafite} - -In the future we expect to integrate applicative and equational -rewriting. In particular, the overall idea would be to integrate -applicative rewriting with demodulation, treating saturation as an -operation to be performed in batch mode, e.g. during the night. - -\subsection{Naming convention} -\label{sec:naming} - -A minor but not entirely negligible aspect of \MATITA{} is that of -adopting a (semi)-rigid naming convention for concept names, derived by -our studies about metadata for statements. -The convention is only applied to theorems -(not definitions), and relates theorem names to their statements. -The basic rules are the following: -\begin{itemize} - - \item each name is composed by an ordered list of (short) - identifiers occurring in a left to right traversal of the statement; - - \item all names should (but this is not strictly compulsory) - separated by an underscore; - - \item names occurring in two different hypotheses, or in an hypothesis - and in the conclusion must be separated by the string \texttt{\_to\_}; - - \item the identifier may be followed by a numerical suffix, or a - single or double apostrophe. - -\end{itemize} - -Take for instance the statement: -\begin{grafite} -\forall n: nat. n = plus n O -\end{grafite} -Possible legal names are: \texttt{plus\_n\_O}, \texttt{plus\_O}, -\texttt{eq\_n\_plus\_n\_O} and so on. - -Similarly, consider the statement -\begin{grafite} -\forall n,m: nat. n < m to n \leq m -\end{grafite} -In this case \texttt{lt\_to\_le} is a legal name, -while \texttt{lt\_le} is not. - -But what about, say, the symmetric law of equality? Probably you would like -to name such a theorem with something explicitly recalling symmetry. -The correct approach, -in this case, is the following. You should start with defining the -symmetric property for relations: -\begin{grafite} -definition symmetric \def - \lambda A: Type. \lambda R. \forall x,y: A. - R x y \to R y x. -\end{grafite} -Then, you may state the symmetry of equality as: -\begin{grafite} -\forall A: Type. symmetric A (eq A) -\end{grafite} -and \texttt{symmetric\_eq} is a legal name for such a theorem. - -So, somehow unexpectedly, the introduction of semi-rigid naming convention -has an important beneficial effect on the global organization of the library, -forcing the user to define abstract concepts and properties before -using them (and formalizing such use). - -Two cases have a special treatment. The first one concerns theorems whose -conclusion is a (universally quantified) predicate variable, i.e. -theorems of the shape -$\forall P,\dots,.P(t)$. -In this case you may replace the conclusion with the string -\texttt{elim} or \texttt{case}. -For instance the name \texttt{nat\_elim2} is a legal name for the double -induction principle. - -The other special case is that of statements whose conclusion is a -match expression. -A typical example is the following: -\begin{grafite} -\forall n,m: nat. - match (eqb n m) with - [ true \Rightarrow n = m - | false \Rightarrow n \neq m ] -\end{grafite} -where \texttt{eqb} is boolean equality. -In this cases, the name can be build starting from the matched -expression and the suffix \texttt{\_to\_Prop}. In the above example, -\texttt{eqb\_to\_Prop} is accepted. - -\section{The authoring interface} -\label{sec:authoring} - -The authoring interface of \MATITA{} is very similar to Proof -General~\cite{proofgeneral}. We -chose not to build the \MATITA{} UI over Proof General for two reasons. First -of all we wanted to integrate in the UI our rendering technologies, mainly -\GTKMATHVIEW, to render sequents exploiting the bidimensional mathematical layouts -of \MATHML-Presentation. -At the time of writing Proof General supports only text based -rendering.\footnote{This may change with future releases of Proof General -based on Eclipse (\url{http://www.eclipse.org/}).} The second reason is that we wanted -to build the \MATITA{} UI on top of a state-of-the-art and widespread graphical -toolkit as \GTK{} is. - -Fig.~\ref{fig:screenshot} is a screenshot of the \MATITA{} authoring interface. -The foreground -window is an instance of the cicBrowser (see Sect.~\ref{sec:library}) used to -render in natural language the proof under development. - -Note that the syntax used in the script view is \TeX-like, but -Unicode\footnote{\url{http://www.unicode.org/}} is -also fully supported so that mathematical glyphs can be input as such. - -\begin{figure}[!ht] - \begin{center} - \includegraphics[width=0.95\textwidth]{pics/matita-screenshot} - \caption{Authoring interface\strut} - \label{fig:screenshot} - \end{center} -\end{figure} - -Since the concepts of script based proof authoring are well-known, the -remaining part of this section is devoted to the distinguishing -features of the \MATITA{} authoring interface. - -\subsection{Direct manipulation of terms} -\label{sec:directmanip} - -While terms are input as \TeX-like formulae in \MATITA, they are converted to a -mixed \MATHML+\BOXML{} markup for output purposes and then rendered by -\GTKMATHVIEW. As described in~\cite{latexmathml} this mixed choice enables both -high-quality bidimensional rendering of terms (including the use of fancy -layout schemata like radicals and matrices) and the use of a -concise and widespread textual syntax. - -Keeping pointers from the presentations level terms down to the -partially specified ones, \MATITA{} enables direct manipulation of -rendered (sub)terms in the form of hyperlinks and semantic selection. - -\emph{Hyperlinks} have anchors on the occurrences of constant and -inductive type constructors and point to the corresponding definitions -in the library. Anchors are available notwithstanding the use of -user-defined mathematical notation: as can be seen on the right of -Fig.~\ref{fig:directmanip}, where we clicked on $\nmid$, symbols -encoding complex notations retain all the hyperlinks of constants or -constructors used in the notation. - -\emph{Semantic selection} enables the selection of mixed -\MATHML+\BOXML{} markup, constraining the selection to markup -representing meaningful CIC (sub)terms. In the example on the left of -Fig.~\ref{fig:directmanip} is thus possible to select the subterm -$\mathrm{prime}~n$, whereas it would not be possible to select -$\to n$ since the former denotes an application while the -latter is not a subterm. Once a meaningful (sub)term has been -selected actions like reductions or tactic applications can be performed on it. - -\begin{figure}[!ht] - \begin{center} - \includegraphics[width=0.40\textwidth]{pics/matita-screenshot-selection} - \hspace{0.05\textwidth} - \raisebox{0.4cm}{\includegraphics[width=0.50\textwidth]{pics/matita-screenshot-href}} - \caption[Semantic selection and hyperlinks]{Semantic selection (on the left) - and hyperlinks (on the right)\strut} - \label{fig:directmanip} - \end{center} -\end{figure} - -\subsection{Patterns} -\label{sec:patterns} - -In several situations working with direct manipulation of terms is -simpler and faster than typing the corresponding textual -commands~\cite{proof-by-pointing}. -Nonetheless we need to record actions and selections in scripts. - -In \MATITA{} \emph{patterns} are textual representations of selections. -Users can select using the GUI and then ask the system to paste the -corresponding pattern in this script. More often this process is -transparent to the user: once an action is performed on a selection, -the corresponding -textual command is computed and inserted in the script. - -\subsubsection{Pattern syntax} - -Patterns are composed of two parts: \NT{sequent\_path} and -\NT{wanted}; their concrete syntax is reported in Tab.~\ref{tab:pathsyn}. - -\NT{sequent\_path} mocks-up a sequent, discharging unwanted subterms -with \OP{?} and selecting the interesting parts with the placeholder -\OP{\%}. \NT{wanted} is a term that lives in the context of the -placeholders. - -Textual patterns produced from a graphical selection are made of the -\NT{sequent\_path} only. Such patterns can represent every selection, -but are quite verbose. The \NT{wanted} part of the syntax is meant to -help the users in writing concise and elegant patterns by hand. - -\begin{table} - \caption{Patterns concrete syntax\strut} - \label{tab:pathsyn} -\hrule -\[ -\begin{array}{@{}rcll@{}} - \NT{pattern} & - ::= & [~\verb+in+~\NT{sequent\_path}~]~[~\verb+match+~\NT{wanted}~] & \\ - \NT{sequent\_path} & - ::= & \{~\NT{ident}~[~\verb+:+~\NT{multipath}~]~\}~ - [~\verb+\vdash+~\NT{multipath}~] & \\ - \NT{multipath} & ::= & \NT{term\_with\_placeholders} & \\ - \NT{wanted} & ::= & \NT{term} & \\ -\end{array} -\] -\hrule -\end{table} - -\subsubsection{Pattern evaluation} - -Patterns are evaluated in two phases. The first selects roots -(subterms) of the sequent, using the \NT{sequent\_path}, while the -second searches the \NT{wanted} term starting from that roots. -% Both are optional steps, and by convention the empty pattern selects -% the whole conclusion. - -\begin{description} -\item[Phase 1] - concerns only \NT{sequent\_path}. \NT{ident} is an hypothesis name and selects - the assumption where the following optional \NT{multipath} will operate. - \verb+\vdash+ can be considered the name for the goal. If the whole pattern - is omitted, the whole goal will be selected. If one or more hypothesis names - are given, the selection is restricted to that assumptions. If a - $\NT{multipath}$ is omitted the whole assumption is selected. Remember that - the user can be mostly unaware of patterns concrete syntax, since the system - is able to write down a \NT{sequent\_path} starting from a graphical - selection. - - A \NT{multipath} is a CIC term in which a special constant \OP{\%} is allowed. - The roots of discharged subterms are marked with \OP{?}, while \OP{\%} is used - to select roots. The default \NT{multipath}, the one that selects the whole - term, is simply \OP{\%}. Valid \NT{multipath} are, for example, \texttt{(? \% - ?)} or \texttt{\% \TEXMACRO{to} (\% ?)} that respectively select the first - argument of an application or the source of an arrow and the head of the - application that is found in the arrow target. - - This phase not only selects terms (roots of subterms) but determines also - their context that will be possibly used in the next phase. - -\item[Phase 2] - plays a role only if \NT{wanted} is specified. From the first phase we - have some terms, that we will use as roots, and their context. - For each of these contexts the \NT{wanted} term is disambiguated in it - and the corresponding root is searched for a subterm that can be unified to - \NT{wanted}. The result of this search is the selection the - pattern represents. - -\end{description} - -\subsubsection{Examples} - -Consider the following sequent: -\sequent{n: nat\\m: nat\\H: m + n = n}{m = O} - -To change the right part of the equality of the $H$ -hypothesis with $O + n$, the user selects and pastes it as the pattern -in the following statement. -\begin{grafite} - change in H:(? ? ? %) with (O + n). -\end{grafite} - -To understand the pattern (or produce it by hand) the user should be aware that -the notation $m + n = n$ hides the term $\mathrm{eq}~\mathrm{nat}~(m + n)~n$, so -that the pattern selects only the third argument of $\mathrm{eq}$. - -The experienced user may also write by hand a concise pattern to change at once -all the occurrences of $n$ in the hypothesis $H$: -\begin{grafite} - change in H match n with (O + n). -\end{grafite} - -In this case the \NT{sequent\_path} selects the whole $H$, while -the second phase locates $n$. - -The latter pattern is equivalent to the following one, that the system -can automatically generate from the selection. -\begin{grafite} - change in H:(? ? (? ? %) %) with (O + n). -\end{grafite} - -\subsubsection{Comparison with \COQ{}} - -In \MATITA{} all the tactics that act on subterms of the current sequent -accept pattern arguments. Additional arguments can be disambiguated in the -contexts of the terms selected by the pattern -(\emph{context-dependent arguments}). - -%\NOTE{attualmente rewrite e fold non supportano fase 2. per -%supportarlo bisogna far loro trasformare il pattern phase1+phase2 -%in un pattern phase1only come faccio nell'ultimo esempio. lo si fa -%con una pattern\_of(select(pattern))} - -\COQ{} has two different ways of restricting the application of tactics to -subterms of the current sequent, both relying on the same special syntax to -identify a term occurrence. - -The first way is to use this special syntax to tell the -tactic what occurrences of a wanted term should be affected. -The second is to prepare the sequent with another tactic called -\TAC{pattern} and then apply the real tactic. Note that the choice is not -left to the user, since some tactics needs the sequent to be prepared -with pattern and do not accept directly this special syntax. - -The idea is that to identify a subterm of the sequent we can -write it and say that we want, for example, its third and fifth -occurrences (counting from left to right). In our previous example, -to change only the left part of the equivalence, the correct command -would be: -\begin{grafite} - change n at 2 in H with (O + n) -\end{grafite} -meaning that in the hypothesis \texttt{H} the \texttt{n} we want to change is -the second we encounter proceeding from left to right. - -The tactic \TAC{pattern} computes a -$\beta$-expansion of a part of the sequent with respect to some -occurrences of the given term. In the previous example the following -command: -\begin{grafite} - pattern n at 2 in H -\end{grafite} -would have resulted in the sequent: -\sequent{n: nat\\m : nat\\H: (fun~n0: nat => m + n = n0)~n}{m = 0} -where \texttt{H} is $\beta$-expanded over the second \texttt{n} -occurrence. - -At this point, since \COQ{} unification algorithm is essentially first-order, -the application of an elimination principle (a term of type\linebreak -$\forall P.\forall -x.(H~x)\to (P~x)$) will unify \texttt{x} with \texttt{n} and \texttt{P} with -\texttt{(fun n0: nat => m + n = n0)}. - -Since \TAC{rewrite}, \TAC{replace} and several other tactics boil down to -the application of the equality elimination principle, the previous -trick implements the expected behavior. - -The idea behind this way of identifying subterms in not really far -from the idea behind patterns, but fails in extending to -complex notation, since it relies on a mono-dimensional sequent representation. -Real mathematical notation places arguments upside-down (like in indexed sums or -integrations) or even puts them inside a bidimensional matrix. -In these cases using the mouse to select the wanted term is probably the -more effective way to tell the system what to do. -One of the goals of \MATITA{} is to use modern publishing techniques, -so we prefer our method that does not discourage the use of complex layout schemata. - -In \MATITA{}, tactics accepting pattern arguments can be more expressive than -the equivalent tactics in \COQ{} since variables bound in the pattern context, -can occur in context-dependent arguments. For example, consider the sequent: -\sequent{n: nat\\x: nat\\H: \forall m. n + m*n = x + m}{m = O} -In \MATITA{} the user can issue the command: -\begin{grafite} -change in H: \forall _. (? ? % ?) with (S m) * n. -\end{grafite} -to change $n+m*n$ with $(S~m)*n$. To achieve the same effect in \COQ, the -user is forced to change the whole hypothesis rewriting its right hand side -as well. - -\subsection{Tacticals} -\label{sec:tinycals} - -The procedural proof language implemented in \MATITA{} is pretty standard, -with a notable exception for tacticals. - -Tacticals first appeared in LCF~\cite{lcf} as higher order tactics. -They can be seen as control flow constructs like looping, branching, -error recovery and sequential composition. - -The following simple example shows a \COQ{} script made of four dot-terminated -commands: -\begin{grafite} -Theorem trivial: - forall A B:Prop, - A = B -> ((A -> B) /\ (B -> A)). - intros [A B H]. - split; intro; - [ rewrite < H; assumption - | rewrite > H; assumption - ]. -Qed. -\end{grafite} - -The third command is an application of the sequencing tactical \OP{$\ldots$~;~$\ldots$}, -that combines the tactic \TAC{split} with the application of the branching -tactical \OP{$\ldots$~;[~$\ldots$~|~$\ldots$~|~$\ldots$~]} to other tactics or tacticals. - -The usual implementation of tacticals executes them atomically as any -other command. In \MATITA{} this is not the case: each punctuation -symbol is executed as a single command. - -\subsubsection{Common issues of tacticals} -We will examine the two main problems of procedural proof languages: -maintainability and readability. - -Tacticals are not only used to make scripts shorter by factoring out -common cases and repeating commands. They are the primary way of making -scripts more maintainable. They also have the well-known duty of -structuring the proof using the branching tactical. - -However, authoring a proof structured with tacticals is annoying. -Consider for example a proof by induction, and imagine you -are using one of the state of the art graphical interfaces for proof assistant -like Proof General. After applying the induction principle you have to choose: -immediately structure the proof or postpone the structuring. -If you decide for the former you have to apply the branching tactical and write -at once tactics for all the cases. Since the user does not even know the -generated goals yet, he can only replace all the cases with the identity -tactic and execute the command, just to receive feedback on the first -goal. Then he has to go one step back to replace the first identity -tactic with the wanted one and repeat the process until all the -branches are closed. - -One could imagine that a structured script is simpler to understand. -This is not the case. -A proof script, being not declarative, is not meant to be read. -However, the user has the need of explaining it to others. -This is achieved by interactively re-playing the script to show each -intermediate proof status. Tacticals make this operation uncomfortable. -Indeed, a tactical is executed atomically, while it is obvious that it -performs lot of smaller steps we are interested in. -To show the intermediate steps, the proof must be de-structured on the -fly, for example replacing \OP{;} with \OP{.} where possible. - -\MATITA{} has a peculiar tacticals implementation that provides the -same benefits as classical tacticals, while not burdening the user -during proof authoring and re-playing. - -\subsubsection{The \MATITA{} approach} - -\begin{table} - \caption{Concrete syntax of tacticals\strut} - \label{tab:tacsyn} -\hrule -\[ -\begin{array}{@{}rcll@{}} - \NT{punctuation} & - ::= & \SEMICOLON \quad|\quad \DOT \quad|\quad \SHIFT \quad|\quad \BRANCH \quad|\quad \MERGE \quad|\quad \POS{\mathrm{NUMBER}~} & \\ - \NT{block\_kind} & - ::= & \verb+focus+ ~|~ \verb+try+ ~|~ \verb+solve+ ~|~ \verb+first+ ~|~ \verb+repeat+ ~|~ \verb+do+~\mathrm{NUMBER} & \\ - \NT{block\_delim} & - ::= & \verb+begin+ ~|~ \verb+end+ & \\ - \NT{command} & - ::= & \verb+skip+ ~|~ \NT{tactic} ~|~ \NT{block\_delim} ~|~ \NT{block\_kind} ~|~ \NT{punctuation} \\ -\end{array} -\] -\hrule -\end{table} - -\MATITA{} tacticals syntax is reported in Tab.~\ref{tab:tacsyn}. -LCF tacticals have been replaced by unstructured more primitive commands; -every LCF tactical is semantically equivalent to a sequential composition of -them. As usual, each command is executed atomically, so that a sequence -corresponding to an LCF tactical is now executed in multiple steps. - -For instance, reconsider the previous example of a proof by induction. -In \MATITA{} the user can apply the induction principle, and just -open the branching punctuation symbol \OP{[}. Then he can interact with the -system (applying tactics and so forth) until he decides to move to the -next branch using \OP{|}. After the last branch, the punctuation symbol -\OP{]} must be used to collect goals possibly left open, accordingly to -the semantics of the LCF branching tactical \OP{$\ldots$~;[~$\ldots$~|~$\ldots$~|~$\ldots$~]}. The result effortlessly obtained is a structured script. - -The user is not forced to fully structure his script. If he wants, he -can even write completely unstructured proofs using only the \OP{.} -punctuation symbol. - -Re-playing a proof is also straightforward since there is no longer any need -to manually destructure the proof. - -\section{Standard library} -\label{sec:stdlib} - -\MATITA{} is \COQ{} compatible, in the sense that every theorem of \COQ{} can be -read, checked and referenced in further developments. However, in order to test -the actual usability of the system, a new library of results has been started -from scratch. In this case, of course, we wrote (and offer) the source scripts, -while in the case of \COQ{} \MATITA{} may only rely on XML files of \COQ{} -objects. - -The current library just comprises about one thousand theorems in -elementary aspects of arithmetics up to the multiplicative property for -Eulers' totient function $\phi$. - -The library is organized in five main directories: \texttt{logic} (connectives, -quantifiers, equality, \ldots), \texttt{datatypes} (basic datatypes and type -constructors), \texttt{nat} (natural numbers), \texttt{Z} (integers), \texttt{Q} -(rationals). The most complex development is \texttt{nat}, organized in 25 -scripts, listed in Tab.~\ref{tab:scripts}. - -\begin{table}[ht] - \begin{tabular}{lll} - \FILE{nat.ma} & \FILE{plus.ma} & \FILE{times.ma} \\ - \FILE{minus.ma} & \FILE{exp.ma} & \FILE{compare.ma} \\ - \FILE{orders.ma} & \FILE{le\_arith.ma} & \FILE{lt\_arith.ma} \\ - \FILE{factorial.ma} & \FILE{sigma\_and\_pi.ma} & \FILE{minimization.ma} \\ - \FILE{div\_and\_mod.ma} & \FILE{gcd.ma} & \FILE{congruence.ma} \\ - \FILE{primes.ma} & \FILE{nth\_prime.ma} & \FILE{ord.ma} \\ - \FILE{count.ma} & \FILE{relevant\_equations.ma} & \FILE{permutation.ma} \\ - \FILE{factorization.ma} & \FILE{chinese\_reminder.ma} & - \FILE{fermat\_little\_th.ma} \\ - \FILE{totient.ma} & & \\ - \end{tabular} - \caption{Scripts on natural numbers in the standard library\strut} - \label{tab:scripts} -\end{table} - -\section{Conclusions and future work} -\label{sec:conclusion} - -In this paper we have described the architecture of the \MATITA{} proof -assistant and the peculiarities of its user interface. \MATITA{} is -characterized by the central role played by its library and the integration -with modern Mathematical Knowledge Management tools and techniques. -Among them: an innovative search technique that is addressed to both -end-users and automatic proof search procedures; a \MATHML-based -rendering interface coupled with a disambiguating parser to fully exploit -the usual mathematical notation, using constants and symbols from the -whole library; direct manipulation of terms in the GUI, -with a corresponding elaborated -textual syntax to record user actions; fine-grained execution of -tacticals to simplify script structuring and replaying; a browsing -tool to navigate the hypertext formed by the concepts in the library, -integrated with natural language rendering of proofs; compatibility with -the library of the \COQ{} proof assistant. - -In the near future we plan to continue the development focusing on and enhancing -the peculiarities of \MATITA, starting from its document-centric philosophy. -In particular, we do not plan to maintain the library in a centralized way, -as most of the other systems do. On the contrary we are currently developing -Wiki-technologies to support collaborative development of the library, -encouraging people to expand, modify and elaborate previous contributions. -As a first step in this direction, we will integrate \MATITA{} with a -revision control system, building on the invalidation and re-generation -concepts already implemented to grant logical consistency. - -Thanks to an increasing dissemination activity, we hope in the\linebreak medium -term to attract users to form a critical mass of users to enter in -direct competition with the (still too few) major actors in the field. -To this aim, we are also progressing in the development of a core library, -mainly to identify possible unnoticed problems and to give evidence of the -usability of the system. - -\acknowledgements -We would like to thank all the people that during the past -7 years collaborated in the \HELM{} project and contributed to -the development of \MATITA{}, and in particular -M.~Galat\`a, A.~Griggio, F.~Guidi, P.~Di~Lena, L.~Padovani, I.~Schena, M.~Selmi, -and V.~Tamburrelli. - -\theendnotes - -\bibliography{matita} - -\end{document}