\institute{Department of Computer Science, University of Bologna\\
Mura Anteo Zamboni, 7 --- 40127 Bologna, ITALY\\
\email{$\{$asperti,sacerdot,tassi,zacchiro$\}$@cs.unibo.it}}
+\bibliographystyle{plain}
\begin{document}
\maketitle
\section{Introduction}
\label{sec:intro}
{\em Matita} is the proof assistant under development by the \HELM{} team
-\cite{annals} at the University of Bologna, under the direction of
+\cite{mkm-helm} at the University of Bologna, under the direction of
Prof.~Asperti.
The origin of the system goes back to 1999. At the time we were mostly
interested to develop tools and techniques to enhance the accessibility
\HELM{} a MathML-compliant widget for the GTK graphical environment
which can be integrated in any application.
\end{itemize}
-The exportation issue, extensively discussed in \cite{exportation},
+The exportation issue, extensively discussed in \cite{exportation-module},
has several major implications worth to be discussed.
The first
\ASSIGNEDTO{zack}
\begin{table}
- \caption{\label{tab:termsyn} Concrete syntax of CIC terms in \MATITA{}.\strut}
+ \caption{\label{tab:termsyn} Concrete syntax of CIC terms: built-in notation\strut}
\hrule
\[
\begin{array}{@{}rcll@{}}
\hrule
\end{table}
+\subsubsection{Term input}
+The primary form of user interaction employed by \MATITA{} is textual script
+editing: the user can modifies it and evaluate step by step its composing
+\emph{statements}. Examples of statements are inductive type definitions,
+theorem declarations, LCF-style tacticals, and macros (e.g. \texttt{Check} can
+be used to ask the system to refine a given term and pretty print the result).
+Since many statements refer to terms of the underlying calculus, \MATITA{} needs
+a concrete syntax able to encode terms of the Calculus of Inductive
+Constructions.
+Two of the requirements in the design of such a syntax are apparently in
+contrast:
+\begin{enumerate}
+ \item the syntax should be as close as possible to common mathematical practice
+ and implement widespread mathematical notions;
+ \item each term described by the syntax should be non-ambiguous meaning that it
+ should exists a function which associates to each term of the syntax a CIC
+ term.
+\end{enumerate}
+
+These two requirements are addressed in \MATITA{} by the mean of two mechanisms
+which work together: \emph{term disambiguation} and \emph{extensible notation}.
+Their interaction is visible in the architecture of the \MATITA{} input phase,
+depicted in Fig.~\ref{fig:inputphase}. The architecture is articulated as a
+pipline of three levels: the concrete syntax level (level 0) is the one the user
+has to deal with when inserting CIC terms; the abstract syntax level (level 2)
+is an internal representation which intuitively encodes mathematical formulae at
+the content level \NOTE{rif. per\\ content}; the formal mathematics level (level
+3) is the CIC encoding of terms.
+
+Requirement (1) is addressed by a built-in concrete syntax for terms, described
+in Tab.~\ref{tab:termsyn}, and the extensible notation mechanisms which offers a
+way for extending available mathematical notations. Extensible notation, which
+is also in charge of providing a parsing function mapping concrete syntax terms
+to content level terms, is described in Sect.~\ref{sec:notation}. Requirement
+(2) is addressed by the conjunct action of that parsing function and
+disambiguation which provides a function from content level terms to CIC terms.
+
+\subsubsection{Sources of ambiguity}
+
+The translation from content level terms to CIC terms is not straightforward
+because some nodes of the content encoding admit more that one CIC encoding,
+invalidating requirement (2).
+
+\begin{example}
+
+ Consider the term \texttt{\TEXMACRO{forall} x. x + ln 1 = x}, the type of a
+ lemma the user may want to prove. Assuming that both \texttt{+} and \texttt{=}
+ are parsed as infix operators, all the following questions are legitimate and
+ must be answered before obtaining a CIC term from its content level encoding
+ (Fig.~\ref{fig:inputphase}(b)):
+
+ \begin{enumerate}
-The disambiguation phase builds CIC terms from ASTs of user inputs (also called
-\emph{ambiguous terms}). Ambiguous terms may carry three different \emph{sources
-of ambiguity}: unbound identifiers, literal numbers, and literal symbols.
-\emph{Unbound identifiers} are sources of ambiguity since the same name
-could have been used to represent different objects. For example, three
-different theorems of the \COQ{} library share the name $\mathit{plus\_assoc}$
-(locating them is an exercise for the interested reader. Hint: use \WHELP's
-\LOCATE{} query).
+ \item Since \texttt{ln} is an unbound identifier, which CIC constants does it
+ represent? Many different theorems in the library may share its (rather
+ short) name \dots
-\emph{Numbers} are ambiguous since several different encodings of them could be
-provided in logical systems. In the \COQ{} standard library for example we found
-naturals (in their unary encoding), positives (binary encoding), integers
-(signed positives), and reals. Finally, \emph{symbols} (instances of the
-\emph{binop} and \emph{unop} syntactic categories of Table~\ref{tab:syntax}) are
-ambiguous as well: infix \texttt{+} for example is overloaded to represent
-addition over the four different kinds of numbers available in the \COQ{}
-standard library. Note that given a term with more than one sources of
-ambiguity, not all possible disambiguation choices are valid: for example, given
-the input \texttt{1+1} we must choose an interpretation of \texttt{+} which is
-typable in CIC according to the chosen interpretation for \texttt{1}; choosing
-as \texttt{+} the addition over natural numbers and as \texttt{1} the real
-number $1$ will lead to a type error.
+ \item Which kind of number (\IN, \IR, \dots) the \texttt{1} literal stand for?
+ Which encoding is used in CIC to represent it? E.g., assuming $1\in\IN$, is
+ it an unary or a binary encoding?
+
+ \item Which kind of equality the ``='' node represents? Is it Leibniz's
+ polymorhpic equality? Is it a decidable equality over \IN, \IR, \dots?
+
+ \end{enumerate}
+
+\end{example}
+
+In \MATITA, three \emph{sources of ambiguity} are admitted for content level
+terms: unbound identifiers, literal numbers, and literal symbols.
+
+\emph{Unbound identifiers} (question 1) are sources of ambiguity since the same
+name could have been used in the proof assistant library to represent different
+objects. \emph{Numbers} (question 2) are ambiguous since several different
+encodings of them could be provided in the calculus. Finally, \emph{symbols}
+(question 3) are ambiguous as well, since they may be used in an overloaded
+fashion to represent the application of different objects.
+
+\textbf{FINQUI, il resto \`e copy and paste dal Whelp paper \dots}
+
+Note that given a content level term with more than one sources of ambiguity,
+not all possible disambiguation choices are valid: for example, given the input
+\texttt{1+1} we must choose an interpretation of \texttt{+} which is typable in
+CIC according to the chosen interpretation for \texttt{1}; choosing as
+\texttt{+} the addition over natural numbers and as \texttt{1} the real number
+$1$ will lead to a type error.
A \emph{disambiguation algorithm} takes as input an ambiguous term and return a
fully determined CIC term. The \emph{naive disambiguation algorithm} takes as
several other logics.
\emph{Metavariables}~\cite{munoz} are typed, non linear placeholders that can
occur in terms; $?_i$ usually denotes the $i$-th metavariable, while $?$ denotes
-a freshly created metavariable. A \emph{refiner}~\cite{mcbride} is a
+a freshly created metavariable. A \emph{refiner}~\cite{McBride} is a
function whose input is a term with placeholders and whose output is either a
new term obtained instantiating some placeholder or $\epsilon$, meaning that no
well typed instantiation could be found for the placeholders occurring in
that avoids backtracking is also presented.
\subsection{notazione}
+\label{sec:notation}
\ASSIGNEDTO{zack}
\subsection{libreria tutta visibile}
A.Griggio, F.Guidi, P. Di Lena, L.Padovani, I.Schena, M.Selmi,
V.Tamburrelli.
-
-\begin{thebibliography}{}
-
- \bibitem{ida}A.Asperti, H.Geuvers, I.Loeb, L.E.Mamane, C.Sacerdoti Coen.
- An Interactive Algebra Course with Formalised Proofs and Definitions.
- Post-Proceedings of the Fourth International Conference on
- Mathemtical Knowledge Management. Bremen, Germany, July 2005. LNCS,
- to appear.
-
- \bibitem{annals} A.~Asperti, F.~Guidi, L.~Padovani, C.~Sacerdoti Coen,
- I.~Schena. \emph{Mathematical Knowledge Management in HELM}. Annals of
- Mathematics and Artificial Intelligence, 38(1): 27--46; May 2003.
-
- \bibitem{whelp} A.~Asperti, F.~Guidi, C.~Sacerdoti Coen,
- E.Tassi, S.Zacchiroli. \emph{A content based mathematical search
- engine: whelp}. Post-proceedings of the Types 2004 International
- Conference, Jouy-en-Josas, France, December 2004. LNCS (to appear).
-
- \bibitem{metadata2} A. Asperti, M. Selmi. \emph{Efficient Retrieval of
- Mathematical Statements}. In Proceeding of the Third International Conference
- on Mathematical Knowledge Management, MKM 2004. Bialowieza, Poland. LNCS 3119.
- \bibitem{pechino} A.Asperti, B.Wegner. \emph{An Approach to
- Machine-Understandable Representation of the Mathematical Information in
- Digital Documents}. In: Fengshai Bai and Bernd Wegner (eds.): Electronic
- Information and Communication in Mathematics, LNCS vol. 2730,
- pp. 14--23, 2003
-
- \bibitem{coq} The Coq proof-assistant, \url{http://coq.inria.fr}
-
- \bibitem{metadata1} F. Guidi, C. Sacerdoti Coen. \emph{Querying Distributed
- Digital Libraries of Mathematics}. In Proceedings of Calculemus 2003, 11th
- Symposium on the Integration of Symbolic Computation and Mechanized
- Reasoning. Aracne Editrice.
-
- \bibitem{exportation} C. Sacerdoti Coen. \emph{From Proof-Assistans to
- Distributed Libraries of Mathematics: Tips and Pitfalls}.
- In Proc. Mathematical Knowledge Management 2003, Lecture Notes in Computer
- Science, Vol. 2594, pp. 30--44, Springer-Verlag.
-
- \bibitem{disambiguation} C. Sacerdoti Coen, S. Zacchiroli. \emph{Efficient
- Ambiguous Parsing of Mathematical Formulae}. In Proceedings of the Third
- International Conference on Mathematical Knowledge Management, MKM 2004.
- LNCS,3119.
-
-\end{thebibliography}
+\bibliography{matita}
\end{document}