X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=blobdiff_plain;f=helm%2Fpapers%2Fmatita%2Fmatita.tex;h=8ed71abe69bee79f4b2a8763c43d8fa67fbc8abd;hb=2a81818cb4b942c35a1cfa88121e561309e59172;hp=e298dad21197c1a118a63b7c7a1c27ac2a841dc8;hpb=ceb977d9f052e99b0e473c5484578eaaf307a4f6;p=helm.git diff --git a/helm/papers/matita/matita.tex b/helm/papers/matita/matita.tex index e298dad21..8ed71abe6 100644 --- a/helm/papers/matita/matita.tex +++ b/helm/papers/matita/matita.tex @@ -43,6 +43,7 @@ \institute{Department of Computer Science, University of Bologna\\ Mura Anteo Zamboni, 7 --- 40127 Bologna, ITALY\\ \email{$\{$asperti,sacerdot,tassi,zacchiro$\}$@cs.unibo.it}} +\bibliographystyle{plain} \begin{document} \maketitle @@ -53,7 +54,7 @@ \section{Introduction} \label{sec:intro} {\em Matita} is the proof assistant under development by the \HELM{} team -\cite{annals} at the University of Bologna, under the direction of +\cite{mkm-helm} at the University of Bologna, under the direction of Prof.~Asperti. The origin of the system goes back to 1999. At the time we were mostly interested to develop tools and techniques to enhance the accessibility @@ -80,7 +81,7 @@ active in the MathML Working group since 1999, and developed inside \HELM{} a MathML-compliant widget for the GTK graphical environment which can be integrated in any application. \end{itemize} -The exportation issue, extensively discussed in \cite{exportation}, +The exportation issue, extensively discussed in \cite{exportation-module}, has several major implications worth to be discussed. The first @@ -209,7 +210,7 @@ reduce our code in sensible way).\NOTE{righe\\\COQ{}} \ASSIGNEDTO{zack} \begin{table} - \caption{\label{tab:termsyn} Concrete syntax of CIC terms in \MATITA{}.\strut} + \caption{\label{tab:termsyn} Concrete syntax of CIC terms: built-in notation\strut} \hrule \[ \begin{array}{@{}rcll@{}} @@ -254,31 +255,94 @@ reduce our code in sensible way).\NOTE{righe\\\COQ{}} \hrule \end{table} +\subsubsection{Term input} +The primary form of user interaction employed by \MATITA{} is textual script +editing: the user can modifies it and evaluate step by step its composing +\emph{statements}. Examples of statements are inductive type definitions, +theorem declarations, LCF-style tacticals, and macros (e.g. \texttt{Check} can +be used to ask the system to refine a given term and pretty print the result). +Since many statements refer to terms of the underlying calculus, \MATITA{} needs +a concrete syntax able to encode terms of the Calculus of Inductive +Constructions. +Two of the requirements in the design of such a syntax are apparently in +contrast: +\begin{enumerate} + \item the syntax should be as close as possible to common mathematical practice + and implement widespread mathematical notions; + \item each term described by the syntax should be non-ambiguous meaning that it + should exists a function which associates to each term of the syntax a CIC + term. +\end{enumerate} + +These two requirements are addressed in \MATITA{} by the mean of two mechanisms +which work together: \emph{term disambiguation} and \emph{extensible notation}. +Their interaction is visible in the architecture of the \MATITA{} input phase, +depicted in Fig.~\ref{fig:inputphase}. The architecture is articulated as a +pipline of three levels: the concrete syntax level (level 0) is the one the user +has to deal with when inserting CIC terms; the abstract syntax level (level 2) +is an internal representation which intuitively encodes mathematical formulae at +the content level \NOTE{rif. per\\ content}; the formal mathematics level (level +3) is the CIC encoding of terms. + +Requirement (1) is addressed by a built-in concrete syntax for terms, described +in Tab.~\ref{tab:termsyn}, and the extensible notation mechanisms which offers a +way for extending available mathematical notations. Extensible notation, which +is also in charge of providing a parsing function mapping concrete syntax terms +to content level terms, is described in Sect.~\ref{sec:notation}. Requirement +(2) is addressed by the conjunct action of that parsing function and +disambiguation which provides a function from content level terms to CIC terms. + +\subsubsection{Sources of ambiguity} + +The translation from content level terms to CIC terms is not straightforward +because some nodes of the content encoding admit more that one CIC encoding, +invalidating requirement (2). + +\begin{example} + + Consider the term \texttt{\TEXMACRO{forall} x. x + ln 1 = x}, the type of a + lemma the user may want to prove. Assuming that both \texttt{+} and \texttt{=} + are parsed as infix operators, all the following questions are legitimate and + must be answered before obtaining a CIC term from its content level encoding + (Fig.~\ref{fig:inputphase}(b)): + + \begin{enumerate} -The disambiguation phase builds CIC terms from ASTs of user inputs (also called -\emph{ambiguous terms}). Ambiguous terms may carry three different \emph{sources -of ambiguity}: unbound identifiers, literal numbers, and literal symbols. -\emph{Unbound identifiers} are sources of ambiguity since the same name -could have been used to represent different objects. For example, three -different theorems of the \COQ{} library share the name $\mathit{plus\_assoc}$ -(locating them is an exercise for the interested reader. Hint: use \WHELP's -\LOCATE{} query). + \item Since \texttt{ln} is an unbound identifier, which CIC constants does it + represent? Many different theorems in the library may share its (rather + short) name \dots -\emph{Numbers} are ambiguous since several different encodings of them could be -provided in logical systems. In the \COQ{} standard library for example we found -naturals (in their unary encoding), positives (binary encoding), integers -(signed positives), and reals. Finally, \emph{symbols} (instances of the -\emph{binop} and \emph{unop} syntactic categories of Table~\ref{tab:syntax}) are -ambiguous as well: infix \texttt{+} for example is overloaded to represent -addition over the four different kinds of numbers available in the \COQ{} -standard library. Note that given a term with more than one sources of -ambiguity, not all possible disambiguation choices are valid: for example, given -the input \texttt{1+1} we must choose an interpretation of \texttt{+} which is -typable in CIC according to the chosen interpretation for \texttt{1}; choosing -as \texttt{+} the addition over natural numbers and as \texttt{1} the real -number $1$ will lead to a type error. + \item Which kind of number (\IN, \IR, \dots) the \texttt{1} literal stand for? + Which encoding is used in CIC to represent it? E.g., assuming $1\in\IN$, is + it an unary or a binary encoding? + + \item Which kind of equality the ``='' node represents? Is it Leibniz's + polymorhpic equality? Is it a decidable equality over \IN, \IR, \dots? + + \end{enumerate} + +\end{example} + +In \MATITA, three \emph{sources of ambiguity} are admitted for content level +terms: unbound identifiers, literal numbers, and literal symbols. + +\emph{Unbound identifiers} (question 1) are sources of ambiguity since the same +name could have been used in the proof assistant library to represent different +objects. \emph{Numbers} (question 2) are ambiguous since several different +encodings of them could be provided in the calculus. Finally, \emph{symbols} +(question 3) are ambiguous as well, since they may be used in an overloaded +fashion to represent the application of different objects. + +\textbf{FINQUI, il resto \`e copy and paste dal Whelp paper \dots} + +Note that given a content level term with more than one sources of ambiguity, +not all possible disambiguation choices are valid: for example, given the input +\texttt{1+1} we must choose an interpretation of \texttt{+} which is typable in +CIC according to the chosen interpretation for \texttt{1}; choosing as +\texttt{+} the addition over natural numbers and as \texttt{1} the real number +$1$ will lead to a type error. A \emph{disambiguation algorithm} takes as input an ambiguous term and return a fully determined CIC term. The \emph{naive disambiguation algorithm} takes as @@ -315,7 +379,7 @@ metavariables and a refiner can be implemented. This is the case for CIC and several other logics. \emph{Metavariables}~\cite{munoz} are typed, non linear placeholders that can occur in terms; $?_i$ usually denotes the $i$-th metavariable, while $?$ denotes -a freshly created metavariable. A \emph{refiner}~\cite{mcbride} is a +a freshly created metavariable. A \emph{refiner}~\cite{McBride} is a function whose input is a term with placeholders and whose output is either a new term obtained instantiating some placeholder or $\epsilon$, meaning that no well typed instantiation could be found for the placeholders occurring in @@ -359,6 +423,7 @@ be found in~\cite{disambiguation}, where an equivalent algorithm that avoids backtracking is also presented. \subsection{notazione} +\label{sec:notation} \ASSIGNEDTO{zack} \subsection{libreria tutta visibile} @@ -406,51 +471,7 @@ the development of Matita, and in particular A.Griggio, F.Guidi, P. Di Lena, L.Padovani, I.Schena, M.Selmi, V.Tamburrelli. - -\begin{thebibliography}{} - - \bibitem{ida}A.Asperti, H.Geuvers, I.Loeb, L.E.Mamane, C.Sacerdoti Coen. - An Interactive Algebra Course with Formalised Proofs and Definitions. - Post-Proceedings of the Fourth International Conference on - Mathemtical Knowledge Management. Bremen, Germany, July 2005. LNCS, - to appear. - - \bibitem{annals} A.~Asperti, F.~Guidi, L.~Padovani, C.~Sacerdoti Coen, - I.~Schena. \emph{Mathematical Knowledge Management in HELM}. Annals of - Mathematics and Artificial Intelligence, 38(1): 27--46; May 2003. - - \bibitem{whelp} A.~Asperti, F.~Guidi, C.~Sacerdoti Coen, - E.Tassi, S.Zacchiroli. \emph{A content based mathematical search - engine: whelp}. Post-proceedings of the Types 2004 International - Conference, Jouy-en-Josas, France, December 2004. LNCS (to appear). - - \bibitem{metadata2} A. Asperti, M. Selmi. \emph{Efficient Retrieval of - Mathematical Statements}. In Proceeding of the Third International Conference - on Mathematical Knowledge Management, MKM 2004. Bialowieza, Poland. LNCS 3119. - \bibitem{pechino} A.Asperti, B.Wegner. \emph{An Approach to - Machine-Understandable Representation of the Mathematical Information in - Digital Documents}. In: Fengshai Bai and Bernd Wegner (eds.): Electronic - Information and Communication in Mathematics, LNCS vol. 2730, - pp. 14--23, 2003 - - \bibitem{coq} The Coq proof-assistant, \url{http://coq.inria.fr} - - \bibitem{metadata1} F. Guidi, C. Sacerdoti Coen. \emph{Querying Distributed - Digital Libraries of Mathematics}. In Proceedings of Calculemus 2003, 11th - Symposium on the Integration of Symbolic Computation and Mechanized - Reasoning. Aracne Editrice. - - \bibitem{exportation} C. Sacerdoti Coen. \emph{From Proof-Assistans to - Distributed Libraries of Mathematics: Tips and Pitfalls}. - In Proc. Mathematical Knowledge Management 2003, Lecture Notes in Computer - Science, Vol. 2594, pp. 30--44, Springer-Verlag. - - \bibitem{disambiguation} C. Sacerdoti Coen, S. Zacchiroli. \emph{Efficient - Ambiguous Parsing of Mathematical Formulae}. In Proceedings of the Third - International Conference on Mathematical Knowledge Management, MKM 2004. - LNCS,3119. - -\end{thebibliography} +\bibliography{matita} \end{document}