first part on disambiguation passes

author Stefano Zacchiroli <zack@upsilon.cc>

Mon, 28 Nov 2005 13:01:50 +0000 (13:01 +0000)

committer Stefano Zacchiroli <zack@upsilon.cc>

Mon, 28 Nov 2005 13:01:50 +0000 (13:01 +0000)
author Stefano Zacchiroli <zack@upsilon.cc>
Mon, 28 Nov 2005 13:01:50 +0000 (13:01 +0000)
committer Stefano Zacchiroli <zack@upsilon.cc>
Mon, 28 Nov 2005 13:01:50 +0000 (13:01 +0000)
diff --git a/helm/papers/matita/matita.bib b/helm/papers/matita/matita.bib

index f98e9c431b865614f22e1fb814f43de6a161fc15..5b53ac0b9adbc95f16e08c610277006cce23b59d 100644 (file)
--- a/helm/papers/matita/matita.bib
+++ b/helm/papers/matita/matita.bib
@@ -919,6 +919,12 @@
   key = "MoWGLI Proposal"
  }
  
+@misc{debrujinfactor,
+ title = "The ``de Bruijn factor''",
+ howpublished = "\\\url{http://www.cs.ru.nl/~freek/factor/}",
+ author = "Freek Wiedijk",
+}
+
  @misc{mowgli-deliverables,
   title = "{MoWGLI} Project Deliverables",
   howpublished = "\\\url{http://mowgli.cs.unibo.it/html_no_frames/deliverables/index.html}",
diff --git a/helm/papers/matita/matita2.tex b/helm/papers/matita/matita2.tex

index e2af8870cba8fd4ecfd17cbb8da013a7e19d0177..77b7c14062def045b52bff9dd46654fca3285ee9 100644 (file)
--- a/helm/papers/matita/matita2.tex
+++ b/helm/papers/matita/matita2.tex
@@ -250,6 +250,7 @@ In the future we plan to exploit \MATITA{} as a test bench for new ideas and
  extensions. Keeping the single libraries and the whole architecture as
  simple as possible is thus crucial to speed up future experiments and to
  allow other developers to quickly understand our code and contribute.
+
  %For direct experience of the authors, the learning curve to understand and
  %be able to contribute to \COQ{}'s code is quite steep and requires direct
  %and frequent interactions with \COQ{} developers.
@@ -428,6 +429,8 @@ an improvement over fully specified terms since they allow to omit redundant
  information that can be inferred by the refiner.
  
  \subsection{Content level terms}
+\label{sec:contentintro}
+
  The language used to communicate proofs and expecially expressions with the
  user does not only needs to be extendible and accomodate the usual mathematical
  notation. It must also reflect the comfortable degree of imprecision and
@@ -489,7 +492,7 @@ the ambiguity of the content level. As a consequence the translation
  is guided by an \emph{interpretation}, that is a function that chooses for
  every ambiguous expression one partially specified term. The
  \texttt{disambiguation} library contains the implementation of the
-disambiguation algorithm we presented in \cite{disambiguation} that is
+disambiguation algorithm we presented in~\cite{disambiguation} that is
  responsible of building in an efficicent way the set of all ``correct''
  interpretations. An interpretation is correct if the partially specified term
  obtained using the interpretation is refinable.
@@ -997,18 +1000,129 @@ even being a so simple idea:
    goal) gives you the feeling of what is going on.
  \end{description}
  
+\section{Content level terms}
+
+\subsection{Disambiguation}
+
+Software applications that involve input of mathematical content should strive
+to require the user as less drift from informal mathematics as possible. We
+believe this to be a fundamental aspect of such applications user interfaces.
+Being that drift in general very large when inputing
+proofs~\cite{debrujinfactor}, in \MATITA{} we achieved good results for
+mathematical formulae which can be input using a \TeX-like encoding (the
+concrete syntax corresponding to presentation level terms) and are then
+translated (in multiple steps) to partially specified terms as sketched in
+Sect.~\ref{sec:contentintro}.
+
+The key component of the translation is the generic disambiguation algorithm
+implemented in the \texttt{disambiguation} library of Fig.~\ref{fig:libraries}
+and presented in~\cite{disambiguation}. In this section we present how to use
+such an algorithm in the context of the development of a library of formalized
+mathematics. We proceed by examples took from the \MATITA{} standard library.
+
+\subsubsection{Disambiguation aliases}
+
+Let's start with the definition of the ``strictly greater then'' notion over
+(Peano) natural numbers.
+
+\begin{grafite}
+include "nat/nat.ma".
+..
+definition gt: nat \to nat \to Prop \def
+  \lambda n, m. m < n.
+\end{grafite}
+
+The \texttt{include} statement adds the requirement that the part of the library
+defining the notion of (Peano) natural numbers should be defined before
+processing the following definition. Note indeed that the algorithm presented
+in~\cite{disambiguation} does not describe where interpretations for ambiguous
+expressions come from, since it is application-specific. As a first
+approximation, we will assume that in \MATITA{} they come from the library (i.e.
+all interpretations available in the library are used) and the \texttt{include}
+statements are used to ensure the availability of required library slices (see
+Sect.~\ref{sec:libmanagement}).
+
+While processing the \texttt{gt} definition, \MATITA{} has to disambiguate two
+terms: its type and its body. Being available in the required library only one
+interpretation both for the unbound identifier \texttt{nat} and for the
+\texttt{<} operator, and being the resulting partially specified term refinable,
+both type and body are easily disambiguated.
+
+Now suppose we have defined integers as signed Peano numbers, and that we want
+to prove a theorem about an order relationship already defined on them (which of
+course overload the \texttt{<} operator):
+
+\begin{grafite}
+include "Z/z.ma".
+..
+theorem Zlt_compat:
+  \forall x, y, z. x < y \to y < z \to x < z.
+\end{grafite}
+
+Since integers are defined on top of Peano numbers, the part of the library
+concerning the latters is available when disambiguating \texttt{Zlt\_compat}'s
+type. Thus, according to the disambiguation algorithm, two different partially
+specified terms could be associated to it. At first, this might not be seen as a
+problem, since the user is asked and can choose interactively which of the two
+she had in mind. However in the long run it has the drawbacks of inhibiting
+batch compilation of the library (a technique used in \MATITA{} for behind the
+scene compilation when needed, e.g. when an \texttt{include} is issued) and
+yields to poor user interaction (imagine how tedious would be to be asked for a
+choice each time you re-evaluate \texttt{Zlt\_compat}!).
+
+For this reason we added to \MATITA{} the concept of \emph{disambiguation
+aliases}. Disambiguation aliases are one-to-many mappings from ambiguous
+expressions to partially specified terms, which are part of the runtime status
+of \MATITA. They can be provided by users with the \texttt{alias} statement, but
+are usually automatically added when evaluating \texttt{include} statements
+(\emph{implicit aliases}).  Moreover, \MATITA{} does it best to ensure that
+terms which require interactive choice are saved in batch compilable format.
+Thus, after evaluating the above theorem the script will be changed to the
+following snippet (assuming that the interpretation of \texttt{<} over integers
+has been choosed):
+
+\begin{grafite}
+alias symbol "lt" (instance 0) = "integer 'less than'".
+theorem Zlt_compat:
+  \forall x, y, z. x < y \to y < z \to x < z.
+\end{grafite}
+
+But how are disambiguation aliases used? Since they come from the parts of the
+library explicitely included we may be tempted of using them as the only
+available interpretations. This would speed up the disambiguation, but may fail.
+Consider for example:
+
+\begin{grafite}
+theorem lt_mono: \forall x, y, k. x < y \to x < y + k.
+\end{grafite}
+
+and suppose that the \texttt{+} operator is defined only on Peano numbers. If
+the alias for \texttt{<} points to the integer version of the operator, no
+refinable partially specified term matching the term could be found.
+
+For this reason we choosed to attempt \emph{multiple disambiguation passes}. A
+first pass attempt to disambiguate using the last available disambiguation
+aliases, in case of failure the next pass try again the disambiguation
+forgetting the aliases and using the whole library to retrieve interpretation
+for ambiguous expressions. Since the latter pass may lead to too many choices we
+intertwined an additional pass among the two which use as interpretations all
+the aliases coming for included parts of the library (this is the reason why
+aliases are \emph{one-to-many} mappings instead of one-to-one). This choice
+turned out to be a well-balanced trade-off among performances (earlier passes
+fail quickly) and degree of ambiguity supported for presentation level terms.
+
+\subsubsection{Operator instances}
  
  \acknowledgements
  We would like to thank all the students that during the past
  five years collaborated in the \HELM{} project and contributed to 
  the development of Matita, and in particular
-A.Griggio, F.Guidi, P. Di Lena, L.Padovani, I.Schena, M.Selmi, 
-V.Tamburrelli.
+A.~Griggio, F.~Guidi, P.~Di~Lena, L.~Padovani, I.~Schena, M.~Selmi, 
+and V.~Tamburrelli.
  
  \theendnotes
  
  \bibliography{matita}
  
-
  \end{document}
author	Stefano Zacchiroli <zack@upsilon.cc>
	Mon, 28 Nov 2005 13:01:50 +0000 (13:01 +0000)
committer	Stefano Zacchiroli <zack@upsilon.cc>
	Mon, 28 Nov 2005 13:01:50 +0000 (13:01 +0000)
helm/papers/matita/matita.bib		patch \| blob \| history
helm/papers/matita/matita2.tex		patch \| blob \| history