helm/papers/matita/matita.tex

   1 \documentclass[a4paper]{llncs}
   2 \pagestyle{headings}
   3 \usepackage{graphicx}
   4 \usepackage{amssymb,amsmath}
   5 \usepackage{hyperref}
   6 \usepackage{picins}
   7
   8 %\newcommand{\logo}[3]{
   9 %\parpic(0cm,0cm)(#2,#3)[l]{\includegraphics[width=#1]{whelp-bw}}
  10 %}
  11
  12 \newcommand{\AUTO}{\textsc{Auto}}
  13 \newcommand{\COQ}{Coq}
  14 \newcommand{\ELIM}{\textsc{Elim}}
  15 \newcommand{\HELM}{Helm}
  16 \newcommand{\HINT}{\textsc{Hint}}
  17 \newcommand{\IN}{\ensuremath{\mathbb{N}}}
  18 \newcommand{\INSTANCE}{\textsc{Instance}}
  19 \newcommand{\IR}{\ensuremath{\mathbb{R}}}
  20 \newcommand{\LIBXSLT}{LibXSLT}
  21 \newcommand{\LOCATE}{\textsc{Locate}}
  22 \newcommand{\MATCH}{\textsc{Match}}
  23 \newcommand{\MATITA}{Matita}
  24 \newcommand{\METAHEADING}{Symbol & Position \\ \hline\hline}
  25 \newcommand{\MOWGLI}{MoWGLI}
  26 \newcommand{\NAT}{\ensuremath{\mathit{nat}}}
  27 \newcommand{\NATIND}{\mathit{nat\_ind}}
  28 \newcommand{\OCAML}{OCaml}
  29 \newcommand{\PROP}{\mathit{Prop}}
  30 \newcommand{\REF}[3]{\ensuremath{\mathit{Ref}_{#1}(#2,#3)}}
  31 \newcommand{\TEXMACRO}[1]{\texttt{\char92 #1}}
  32 \newcommand{\UWOBO}{UWOBO}
  33 \newcommand{\WHELP}{Whelp}
  34
  35 \newcommand{\ASSIGNEDTO}[1]{\textbf{Assigned to:} #1}
  36
  37 \title{The Matita proof assistant}
  38 \author{Andrea Asperti, Claudio Sacerdoti Coen, Enrico Tassi
  39  and Stefano Zacchiroli}
  40 \institute{Department of Computer Science, University of Bologna\\
  41  Mura Anteo Zamboni, 7 --- 40127 Bologna, ITALY\\
  42  \email{$\{$asperti,sacerdot,tassi,zacchiro$\}$@cs.unibo.it}}
  43
  44 \begin{document}
  45 \maketitle
  46
  47 \begin{abstract}
  48 \end{abstract}
  49
  50 \section{Introduction}
  51 \label{sec:intro}
  52 {\em Matita} is the proof assistant under development by the Helm team
  53 \cite{annals} at the University of Bologna, under the direction of
  54 Prof.Asperti.
  55 The origin of the system goes back to 1999. At the time we were mostly
  56 interested to develop tools and techniques to enhance the accessibility
  57 via web of formal libraries of mathematics. Due to its dimension, the
  58 library of the coq proof assistant (of the order of 35000 theorems)
  59 was choosed as a privileged test bench for our work, although experiments
  60 have been also conducted with other systems, and notably with Nuprl.
  61 The work, mostly performed in the framework of the recently concluded
  62 European project IST-33562-Mowgli \cite{pechino}, mainly consisted in the
  63 following teps:
  64 \begin{itemize}
  65 \item exporting the information from the internal representation of
  66 Coq to a system and platform independent format. Since XML was at the
  67 time an emerging standard, we naturally adopted this technology, fostering
  68 a content-based architecture for future system, where the documents
  69 of the library were the the main components around which everything else
  70 has to be build;
  71 \item developing indexing and searching techniques supporting semantic
  72 queries to the library; these efforts gave birth to our {\em whelp}
  73 search engine, described in \cite{whelp};
  74 \item developing languages and tools for a high-quality notational
  75 rendering of mathematical information; in particular, we have been
  76 active in the MathML Working group since 1999, and developed inside
  77 Helm a MathML-compliant widget for the GTK graphical environment
  78 which can be integrated in any application.
  79 \end{itemize}
  80 The exportation issue, extensively discussed in \cite{exportation},
  81 has several major implications worth to be discussed.
  82
  83 The first
  84 point concern the kind of content information to be exported. In a
  85 proof assistant like coq, proofs are represented in at least three clearly
  86 distinguishable formats: scripts (i.e. sequences of commands issued by the
  87 user to the system during an interactive session of proof), proof objects
  88 (which is the low-level representation of proofs in the form of
  89 lambda-terms readable to and checked by kernel) and proof-trees (which
  90 is a kind of intermediate representation, vaguely inspired by a sequent
  91 like notation, that inherits most of the defects but essentially
  92 none of the advantages of the previous representations).
  93 Partially related to this problem, there is the
  94 issue of the {\em granularity} of the library: scripts usually comprise
  95 small developments with many definitions and theorems, while
  96 proof objects correspond to individual mathemtical items.
  97
  98 In our case, the choice of the content encoding was eventually dictated
  99 by the methodological assumption of offering the information in a
 100 stable and system independent format. The language of scripts is too
 101 oriented to Coq, and it changes too rapidly to be of any interest
 102 to third parties. On the other side, the language of proof objects
 103 merely depend on
 104 the logical framework (the Calculus of Inductive Constructions, in
 105 the case of Coq), is grammatically simple, semantically clear and,
 106 especially, is very stable (as the kernel of the proof assistants
 107 often is).
 108 So the granularity of the library is at the level of individual
 109 objects, that also justifies from another point of view the need
 110 for efficient searching techniques for retrieving individual
 111 logical items from the repository.
 112
 113 The main (possibly only) problem with proof objects is that they are
 114 difficult to read and do not directly correspond to what the user typed
 115 in. An analogy frequently made in the proof assistant community is that of
 116 comparing the vernacular language of scripts to a high level source language
 117 and lambda terms to the assembly language they are compiled in, We do not
 118 share this view and prefer to look at scripts as an imperative language,
 119 and to lambda terms as their denotational semantics; still, however,
 120 denotational semantics is possibly more formal but surely not more readable
 121 than the imperative source.
 122
 123 For all the previous reasons, a huge amount of work inside Mowgli has
 124 been devoted to automatic reconstruction of proofs in natural language
 125 from lambda terms. Since lambda terms are in close connection
 126 with natural deduction
 127 (thay is still the most natural logical language discovered so far)
 128 the work is not hopeless as it may seem, especially if rendering
 129 is combined, as in our case, with dynamic features supporting
 130 in-line expansions or contraction of subproofs. The final
 131 rendering is probably not entirely satisfactory (see \cite{ida} for a
 132 discussion), but surely
 133 readable (the actual quality largely depends by the way the lambda
 134 term is written).
 135
 136 Summing up, we already disposed of the following tools/techniques:
 137 \begin{itemize}
 138 \item XML specifications for the Calculus of Inductive Constructions,
 139 with tools for parsing and saving mathematical objects in such a format;
 140 \item metadata specifications and tools for indexing and querying the
 141 XML knowledge base;
 142 \item a proof checker (i.e. the {\em kernel} of a proof assistant),
 143 implemented to check that we exported form the coq library all the
 144 logically relevant content;
 145 \item a sophisticated parser (used by the search engine), able to deal
 146 with potentially ambiguous and incomplete information, typical of the
 147 mathematical notation \cite{};
 148 \item a {\em refiner}, i.e. a type inference system, based on complex
 149 existential variables, used by the disambiguating parser;
 150 \item complex transformation algorithms for proof rendering in natural
 151 language;
 152 \item an innovative rendering widget, supporting high-quality bidimensional
 153 rendering, and semantic selection, i.e. the possibility to select semantically
 154 meaningfull rendering expressions, and to past the respective contebt into
 155 a different text area.
 156 \end{itemize}
 157 Starting from all this, the further step of developing our own
 158 proof assistant was too
 159 small and too tempting to be neglected. Essentially, we ``just'' had to
 160 add an authoring interface, and a set of functionalities for the
 161 overall management of the library, integrating everything into a
 162 single system. {\em Matita} is the result of this effort.
 163
 164 At first sight, Matita looks as (and partly is) a Coq clone. This is
 165 more the effect of the circumstances of its creation described
 166 above than the result of a deliberate design. In particular, we
 167 (essentially) share the same foundational dialect of Coq (the
 168 Calculus of Inductive Constructions), the same implementative
 169 language (Ocaml), and the same (script based) authoring philosophy.
 170 However, as we shall see, the analogy essentially stops here.
 171
 172 In a sense; we like to think of Matita as the way Coq would
 173 look like if entirely rerwritten from scratch: just to give an
 174 idea, although Matita currently supports almost all functionalities of
 175 Coq, it links 60000 lins of coaml code, against ... of Coq (and
 176 we are convinced that, starting from scratch againg, we could furtherly
 177 reduce our code in sensible way).
 178
 179
 180 \begin{itemize}
 181  \item scelta del sistema fondazionale
 182  \item sistema indipendente (da Coq)
 183   \begin{itemize}
 184    \item possibilit\`a di sperimentare (soluzioni architetturali, logiche,
 185     implementative, \dots)
 186    \item compatibilit\`a con sistemi legacy
 187   \end{itemize}
 188 \end{itemize}
 189
 190 \section{Features}
 191
 192 \subsection{mathml}
 193 \ASSIGNEDTO{zack}
 194
 195 \subsection{metavariabili}
 196 \ASSIGNEDTO{csc}
 197
 198 \subsection{pattern}
 199 \ASSIGNEDTO{gares}
 200
 201 \subsection{tatticali}
 202 \ASSIGNEDTO{gares}
 203
 204 \subsection{disambiguazione}
 205 \ASSIGNEDTO{zack}
 206
 207 \subsection{notazione}
 208 \ASSIGNEDTO{zack}
 209
 210 \subsection{libreria tutta visibile}
 211 \ASSIGNEDTO{csc}
 212
 213 \subsection{ricerca e indicizzazione}
 214 \ASSIGNEDTO{andrea}
 215
 216 \subsection{auto}
 217 \ASSIGNEDTO{andrea}
 218
 219 \subsection{xml / gestione della libreria}
 220 \ASSIGNEDTO{gares}
 221
 222 \subsection{named variable}
 223 \ASSIGNEDTO{csc}
 224
 225 \subsection{assenza di proof tree / resa in linguaggio naturale}
 226 \ASSIGNEDTO{andrea}
 227
 228 \subsection{selezione semantica, cut paste, hyperlink}
 229 \ASSIGNEDTO{zack}
 230
 231 \subsection{sostituzioni esplicite vs moduli}
 232 \ASSIGNEDTO{csc}
 233
 234 \section{Drawbacks, missing, \dots}
 235
 236 \subsection{moduli}
 237 \ASSIGNEDTO{}
 238
 239 \subsection{ltac}
 240 \ASSIGNEDTO{}
 241
 242 \subsection{estrazione}
 243 \ASSIGNEDTO{}
 244
 245 \subsection{localizzazione errori}
 246 \ASSIGNEDTO{}
 247
 248 \textbf{Acknowledgements}
 249 We would like to thank all the students that during the past
 250 five years collaborated in the \HELM{} project and contributed to
 251 the development of Matita, and in particular
 252 A.Griggio, F.Guidi, P. Di Lena, L.Padovani, I.Schena, M.Selmi,
 253 V.Tamburrelli.
 254
 255
 256 \begin{thebibliography}{}
 257
 258  \bibitem{ida}A.Asperti, H.Geuvers, I.Loeb, L.E.Mamane, C.Sacerdoti Coen.
 259   An Interactive Algebra Course with Formalised Proofs and Definitions.
 260   Post-Proceedings of the Fourth International Conference on
 261   Mathemtical Knowledge Management. Bremen, Germany, July 2005. LNCS,
 262   to appear.
 263
 264  \bibitem{annals} A.~Asperti, F.~Guidi, L.~Padovani, C.~Sacerdoti Coen,
 265   I.~Schena. \emph{Mathematical Knowledge Management in HELM}. Annals of
 266   Mathematics and Artificial Intelligence, 38(1): 27--46; May 2003.
 267
 268  \bibitem{whelp} A.~Asperti, F.~Guidi, C.~Sacerdoti Coen,
 269   E.Tassi, S.Zacchiroli. \emph{A content based mathematical search
 270   engine: whelp}. Post-proceedings of the Types 2004 International
 271   Conference, Jouy-en-Josas, France, December 2004. LNCS (to appear).
 272
 273  \bibitem{metadata2} A. Asperti, M. Selmi. \emph{Efficient Retrieval of
 274   Mathematical Statements}. In Proceeding of the Third International Conference
 275   on Mathematical Knowledge Management, MKM 2004. Bialowieza, Poland. LNCS 3119.
 276  \bibitem{pechino} A.Asperti, B.Wegner. \emph{An Approach to
 277   Machine-Understandable Representation of the Mathematical Information in
 278   Digital Documents}.  In: Fengshai Bai and Bernd Wegner (eds.): Electronic
 279   Information and Communication in Mathematics, LNCS vol. 2730,
 280   pp. 14--23, 2003
 281
 282  \bibitem{coq} The Coq proof-assistant, \url{http://coq.inria.fr}
 283
 284  \bibitem{metadata1} F. Guidi, C. Sacerdoti Coen. \emph{Querying Distributed
 285   Digital Libraries of Mathematics}. In Proceedings of Calculemus 2003, 11th
 286   Symposium on the Integration of Symbolic Computation and Mechanized
 287   Reasoning. Aracne Editrice.
 288
 289  \bibitem{exportation} C. Sacerdoti Coen. \emph{From Proof-Assistans to
 290   Distributed Libraries of Mathematics: Tips and Pitfalls}.
 291   In Proc. Mathematical Knowledge Management 2003, Lecture Notes in Computer
 292   Science, Vol. 2594, pp. 30--44, Springer-Verlag.
 293
 294  \bibitem{disambiguation} C. Sacerdoti Coen, S. Zacchiroli. \emph{Efficient
 295   Ambiguous Parsing of Mathematical Formulae}. In Proceedings of the Third
 296   International Conference on Mathematical Knowledge Management, MKM 2004.
 297   LNCS,3119.
 298
 299 \end{thebibliography}
 300
 301 \end{document}
 302