helm/papers/matita/matita.tex

   1 \documentclass[a4paper]{llncs}
   2 \pagestyle{headings}
   3 \usepackage{color}
   4 \usepackage{graphicx}
   5 \usepackage{amssymb,amsmath}
   6 \usepackage{hyperref}
   7 \usepackage{picins}
   8 \usepackage{fancyvrb}
   9
  10 %\newcommand{\logo}[3]{
  11 %\parpic(0cm,0cm)(#2,#3)[l]{\includegraphics[width=#1]{whelp-bw}}
  12 %}
  13
  14 \newcommand{\AUTO}{\textsc{Auto}}
  15 \newcommand{\COQ}{Coq}
  16 \newcommand{\ELIM}{\textsc{Elim}}
  17 \newcommand{\HELM}{Helm}
  18 \newcommand{\HINT}{\textsc{Hint}}
  19 \newcommand{\IN}{\ensuremath{\mathbb{N}}}
  20 \newcommand{\INSTANCE}{\textsc{Instance}}
  21 \newcommand{\IR}{\ensuremath{\mathbb{R}}}
  22 \newcommand{\IZ}{\ensuremath{\mathbb{Z}}}
  23 \newcommand{\LIBXSLT}{LibXSLT}
  24 \newcommand{\LOCATE}{\textsc{Locate}}
  25 \newcommand{\MATCH}{\textsc{Match}}
  26 \newcommand{\MATITA}{Matita}
  27 \newcommand{\METAHEADING}{Symbol & Position \\ \hline\hline}
  28 \newcommand{\MOWGLI}{MoWGLI}
  29 \newcommand{\NAT}{\ensuremath{\mathit{nat}}}
  30 \newcommand{\NATIND}{\mathit{nat\_ind}}
  31 \newcommand{\NUPRL}{NuPRL}
  32 \newcommand{\OCAML}{OCaml}
  33 \newcommand{\PROP}{\mathit{Prop}}
  34 \newcommand{\REF}[3]{\ensuremath{\mathit{Ref}_{#1}(#2,#3)}}
  35 \newcommand{\TEXMACRO}[1]{\texttt{\char92 #1}}
  36 \newcommand{\UWOBO}{UWOBO}
  37 \newcommand{\WHELP}{Whelp}
  38
  39 \definecolor{gray}{gray}{0.85} % 1 -> white; 0 -> black
  40 \newcommand{\NT}[1]{\langle\mathit{#1}\rangle}
  41 \newcommand{\URI}[1]{\texttt{#1}}
  42
  43 %{\end{SaveVerbatim}\setlength{\fboxrule}{.5mm}\setlength{\fboxsep}{2mm}%
  44 \newenvironment{grafite}{\VerbatimEnvironment
  45  \begin{SaveVerbatim}{boxtmp}}%
  46  {\end{SaveVerbatim}\setlength{\fboxsep}{3mm}%
  47   \begin{center}
  48    \fcolorbox{black}{gray}{\BUseVerbatim[boxwidth=0.9\linewidth]{boxtmp}}
  49   \end{center}}
  50
  51 \newcommand{\ASSIGNEDTO}[1]{\textbf{Assigned to:} #1}
  52 \newcommand{\FILE}[1]{\texttt{#1}}
  53 \newcommand{\NOTE}[1]{\marginpar{\scriptsize #1}}
  54 \newcommand{\TODO}[1]{\textbf{TODO: #1}}
  55
  56 \title{The Matita proof assistant}
  57 \author{Andrea Asperti, Claudio Sacerdoti Coen, Enrico Tassi
  58  and Stefano Zacchiroli}
  59 \institute{Department of Computer Science, University of Bologna\\
  60  Mura Anteo Zamboni, 7 --- 40127 Bologna, ITALY\\
  61  \email{$\{$asperti,sacerdot,tassi,zacchiro$\}$@cs.unibo.it}}
  62 \bibliographystyle{plain}
  63
  64 \begin{document}
  65 \maketitle
  66
  67 \begin{abstract}
  68 \end{abstract}
  69
  70 \section{Introduction}
  71 \label{sec:intro}
  72 {\em Matita} is the proof assistant under development by the \HELM{} team
  73 \cite{mkm-helm} at the University of Bologna, under the direction of
  74 Prof.~Asperti.
  75 The origin of the system goes back to 1999. At the time we were mostly
  76 interested to develop tools and techniques to enhance the accessibility
  77 via web of formal libraries of mathematics. Due to its dimension, the
  78 library of the \COQ{} proof assistant (of the order of 35'000 theorems)
  79 was choosed as a privileged test bench for our work, although experiments
  80 have been also conducted with other systems, and notably with \NUPRL{}.
  81 The work, mostly performed in the framework of the recently concluded
  82 European project IST-33562 \MOWGLI{}~\cite{pechino}, mainly consisted in the
  83 following teps:
  84 \begin{itemize}
  85 \item exporting the information from the internal representation of
  86  \COQ{} to a system and platform independent format. Since XML was at the
  87 time an emerging standard, we naturally adopted this technology, fostering
  88 a content-based architecture for future system, where the documents
  89 of the library were the the main components around which everything else
  90 has to be build;
  91 \item developing indexing and searching techniques supporting semantic
  92  queries to the library; these efforts gave birth to our \WHELP{}
  93 search engine, described in~\cite{whelp};
  94 \item developing languages and tools for a high-quality notational
  95 rendering of mathematical information; in particular, we have been
  96 active in the MathML Working group since 1999, and developed inside
  97 \HELM{} a MathML-compliant widget for the GTK graphical environment
  98 which can be integrated in any application.
  99 \end{itemize}
 100 The exportation issue, extensively discussed in \cite{exportation-module},
 101 has several major implications worth to be discussed.
 102
 103 The first
 104 point concerns the kind of content information to be exported. In a
 105 proof assistant like \COQ{}, proofs are represented in at least three clearly
 106 distinguishable formats: \emph{scripts} (i.e. sequences of commands issued by the
 107 user to the system during an interactive session of proof), \emph{proof objects}
 108 (which is the low-level representation of proofs in the form of
 109 lambda-terms readable to and checked by kernel) and \emph{proof-trees} (which
 110 is a kind of intermediate representation, vaguely inspired by a sequent
 111 like notation, that inherits most of the defects but essentially
 112 none of the advantages of the previous representations).
 113 Partially related to this problem, there is the
 114 issue of the {\em granularity} of the library: scripts usually comprise
 115 small developments with many definitions and theorems, while
 116 proof objects correspond to individual mathemtical items.
 117
 118 In our case, the choice of the content encoding was eventually dictated
 119 by the methodological assumption of offering the information in a
 120 stable and system-independent format. The language of scripts is too
 121 oriented to \COQ, and it changes too rapidly to be of any interest
 122 to third parties. On the other side, the language of proof objects
 123 merely depend on
 124 the logical framework (the Calculus of Inductive Constructions, in
 125 the case of \COQ), is grammatically simple, semantically clear and,
 126 especially, is very stable (as kernels of proof assistants
 127 often are).
 128 So the granularity of the library is at the level of individual
 129 objects, that also justifies from another point of view the need
 130 for efficient searching techniques for retrieving individual
 131 logical items from the repository.
 132
 133 The main (possibly only) problem with proof objects is that they are
 134 difficult to read and do not directly correspond to what the user typed
 135 in. An analogy frequently made in the proof assistant community is that of
 136 comparing the vernacular language of scripts to a high level source language
 137 and lambda terms to the assembly language they are compiled in. We do not
 138 share this view and prefer to look at scripts as an imperative language,
 139 and to lambda terms as their denotational semantics; still, however,
 140 denotational semantics is possibly more formal but surely not more readable
 141 than the imperative source.
 142
 143 For all the previous reasons, a huge amount of work inside \MOWGLI{} has
 144 been devoted to automatic reconstruction of proofs in natural language
 145 from lambda terms. Since lambda terms are in close connection
 146 with natural deduction
 147 (that is still the most natural logical language discovered so far)
 148 the work is not hopeless as it may seem, especially if rendering
 149 is combined, as in our case, with dynamic features supporting
 150 in-line expansions or contractions of subproofs. The final
 151 rendering is probably not entirely satisfactory (see \cite{ida} for a
 152 discussion), but surely
 153 readable (the actual quality largely depends by the way the lambda
 154 term is written).
 155
 156 Summing up, we already disposed of the following tools/techniques:
 157 \begin{itemize}
 158 \item XML specifications for the Calculus of Inductive Constructions,
 159 with tools for parsing and saving mathematical objects in such a format;
 160 \item metadata specifications and tools for indexing and querying the
 161 XML knowledge base;
 162 \item a proof checker (i.e. the {\em kernel} of a proof assistant),
 163  implemented to check that we exported form the \COQ{} library all the
 164 logically relevant content;
 165 \item a sophisticated parser (used by the search engine), able to deal
 166 with potentially ambiguous and incomplete information, typical of the
 167 mathematical notation \cite{};
 168 \item a {\em refiner}, i.e. a type inference system, based on complex
 169 existential variables, used by the disambiguating parser;
 170 \item complex transformation algorithms for proof rendering in natural
 171 language;
 172 \item an innovative rendering widget, supporting high-quality bidimensional
 173 rendering, and semantic selection, i.e. the possibility to select semantically
 174 meaningfull rendering expressions, and to past the respective content into
 175 a different text area.
 176 \NOTE{il widget\\ non ha sel\\ semantica}
 177 \end{itemize}
 178 Starting from all this, the further step of developing our own
 179 proof assistant was too
 180 small and too tempting to be neglected. Essentially, we ``just'' had to
 181 add an authoring interface, and a set of functionalities for the
 182 overall management of the library, integrating everything into a
 183 single system. \MATITA{} is the result of this effort.
 184
 185 At first sight, \MATITA{} looks as (and partly is) a \COQ{} clone. This is
 186 more the effect of the circumstances of its creation described
 187 above than the result of a deliberate design. In particular, we
 188 (essentially) share the same foundational dialect of \COQ{} (the
 189 Calculus of Inductive Constructions), the same implementative
 190 language (\OCAML{}), and the same (script based) authoring philosophy.
 191 However, as we shall see, the analogy essentially stops here.
 192
 193 In a sense; we like to think of \MATITA{} as the way \COQ{} would
 194 look like if entirely rerwritten from scratch: just to give an
 195 idea, although \MATITA{} currently supports almost all functionalities of
 196 \COQ{}, it links 60'000 lins of \OCAML{} code, against ... of \COQ{} (and
 197 we are convinced that, starting from scratch againg, we could furtherly
 198 reduce our code in sensible way).\NOTE{righe\\\COQ{}}
 199
 200 \begin{itemize}
 201  \item scelta del sistema fondazionale
 202  \item sistema indipendente (da Coq)
 203   \begin{itemize}
 204    \item possibilit\`a di sperimentare (soluzioni architetturali, logiche,
 205     implementative, \dots)
 206    \item compatibilit\`a con sistemi legacy
 207   \end{itemize}
 208 \end{itemize}
 209
 210 \section{Features}
 211
 212 \subsection{mathml}
 213 \ASSIGNEDTO{zack}
 214
 215 \subsection{metavariabili}
 216 \label{sec:metavariables}
 217 \ASSIGNEDTO{csc}
 218
 219 \subsection{pattern}
 220 \ASSIGNEDTO{gares}
 221
 222 \subsection{tatticali}
 223 \ASSIGNEDTO{gares}
 224
 225 \subsection{Disambiguation}
 226 \label{sec:disambiguation}
 227 \ASSIGNEDTO{zack}
 228
 229 \begin{table}
 230  \caption{\label{tab:termsyn} Concrete syntax of CIC terms: built-in
 231  notation\strut}
 232 \hrule
 233 \[
 234 \begin{array}{@{}rcll@{}}
 235   \NT{term} & ::= & & \mbox{\bf terms} \\
 236     &     & x & \mbox{(identifier)} \\
 237     &  |  & n & \mbox{(number)} \\
 238     &  |  & s & \mbox{(symbol)} \\
 239     &  |  & \mathrm{URI} & \mbox{(URI)} \\
 240     &  |  & \verb+_+ & \mbox{(implicit)}\TODO{sync} \\
 241     &  |  & \verb+?+n~[\verb+[+~\{\NT{subst}\}~\verb+]+] & \mbox{(meta)} \\
 242     &  |  & \verb+let+~\NT{ptname}~\verb+\def+~\NT{term}~\verb+in+~\NT{term} \\
 243     &  |  & \verb+let+~\NT{kind}~\NT{defs}~\verb+in+~\NT{term} \\
 244     &  |  & \NT{binder}~\{\NT{ptnames}\}^{+}~\verb+.+~\NT{term} \\
 245     &  |  & \NT{term}~\NT{term} & \mbox{(application)} \\
 246     &  |  & \verb+Prop+ \mid \verb+Set+ \mid \verb+Type+ \mid \verb+CProp+ & \mbox{(sort)} \\
 247     &  |  & \verb+match+~\NT{term}~ & \mbox{(pattern matching)} \\
 248     &     & ~ ~ [\verb+[+~\verb+in+~x~\verb+]+]
 249               ~ [\verb+[+~\verb+return+~\NT{term}~\verb+]+] \\
 250     &     & ~ ~ \verb+with [+~[\NT{rule}~\{\verb+|+~\NT{rule}\}]~\verb+]+ & \\
 251     &  |  & \verb+(+~\NT{term}~\verb+:+~\NT{term}~\verb+)+ & \mbox{(cast)} \\
 252     &  |  & \verb+(+~\NT{term}~\verb+)+ \\
 253   \NT{defs}  & ::= & & \mbox{\bf mutual definitions} \\
 254     &     & \NT{fun}~\{\verb+and+~\NT{fun}\} \\
 255   \NT{fun} & ::= & & \mbox{\bf functions} \\
 256     &     & \NT{arg}~\{\NT{ptnames}\}^{+}~[\verb+on+~x]~\verb+\def+~\NT{term} \\
 257   \NT{binder} & ::= & & \mbox{\bf binders} \\
 258     &     & \verb+\forall+ \mid \verb+\lambda+ \\
 259   \NT{arg} & ::= & & \mbox{\bf single argument} \\
 260     &     & \verb+_+ \mid x \\
 261   \NT{ptname} & ::= & & \mbox{\bf possibly typed name} \\
 262     &     & \NT{arg} \\
 263     &  |  & \verb+(+~\NT{arg}~\verb+:+~\NT{term}~\verb+)+ \\
 264   \NT{ptnames} & ::= & & \mbox{\bf bound variables} \\
 265     &     & \NT{arg} \\
 266     &  |  & \verb+(+~\NT{arg}~\{\verb+,+~\NT{arg}\}~[\verb+:+~\NT{term}]~\verb+)+ \\
 267   \NT{kind} & ::= & & \mbox{\bf induction kind} \\
 268     &     & \verb+rec+ \mid \verb+corec+ \\
 269   \NT{rule} & ::= & & \mbox{\bf rules} \\
 270     &     & x~\{\NT{ptname}\}~\verb+\Rightarrow+~\NT{term}
 271 \end{array}
 272 \]
 273 \hrule
 274 \end{table}
 275
 276 \subsubsection{Term input}
 277
 278 The primary form of user interaction employed by \MATITA{} is textual script
 279 editing: the user modifies it and evaluate step by step its composing
 280 \emph{statements}. Examples of statements are inductive type definitions,
 281 theorem declarations, LCF-style tacticals, and macros (e.g. \texttt{Check} can
 282 be used to ask the system to refine a given term and pretty print the result).
 283 Since many statements refer to terms of the underlying calculus, \MATITA{} needs
 284 a concrete syntax able to encode terms of the Calculus of Inductive
 285 Constructions.
 286
 287 Two of the requirements in the design of such a syntax are apparently in
 288 contrast:
 289 \begin{enumerate}
 290  \item the syntax should be as close as possible to common mathematical practice
 291   and implement widespread mathematical notations;
 292  \item each term described by the syntax should be non-ambiguous meaning that it
 293   should exists a function which associates to it a CIC term.
 294 \end{enumerate}
 295
 296 These two requirements are addressed in \MATITA{} by the mean of two mechanisms
 297 which work together: \emph{term disambiguation} and \emph{extensible notation}.
 298 Their interaction is visible in the architecture of the \MATITA{} input phase,
 299 depicted in Fig.~\ref{fig:inputphase}. The architecture is articulated as a
 300 pipline of three levels: the concrete syntax level (level 0) is the one the user
 301 has to deal with when inserting CIC terms; the abstract syntax level (level 2)
 302 is an internal representation which intuitively encodes mathematical formulae at
 303 the content level~\cite{adams}\cite{mkm-structure}; the last level is that of
 304 CIC terms.
 305
 306 \begin{figure}[ht]
 307  \begin{center}
 308   \includegraphics[width=0.9\textwidth]{input_phase}
 309   \caption{\MATITA{} input phase}
 310  \end{center}
 311  \label{fig:inputphase}
 312 \end{figure}
 313
 314 Requirement (1) is addressed by a built-in concrete syntax for terms, described
 315 in Tab.~\ref{tab:termsyn}, and the extensible notation mechanisms which offers a
 316 way for extending available mathematical notations. Extensible notation, which
 317 is also in charge of providing a parsing function mapping concrete syntax terms
 318 to content level terms, is described in Sect.~\ref{sec:notation}.  Requirement
 319 (2) is addressed by the conjunct action of that parsing function and
 320 disambiguation which provides a function from content level terms to CIC terms.
 321
 322 \subsubsection{Sources of ambiguity}
 323
 324 The translation from content level terms to CIC terms is not straightforward
 325 because some nodes of the content encoding admit more that one CIC encoding,
 326 invalidating requirement (2).
 327
 328 \begin{example}
 329  \label{ex:disambiguation}
 330
 331  Consider the term at the concrete syntax level \texttt{\TEXMACRO{forall} x. x +
 332  ln 1 = x} of Fig.~\ref{fig:inputphase}(a), it can be the type of a lemma the
 333  user may want to prove. Assuming that both \texttt{+} and \texttt{=} are parsed
 334  as infix operators, all the following questions are legitimate and must be
 335  answered before obtaining a CIC term from its content level encoding
 336  (Fig.~\ref{fig:inputphase}(b)):
 337
 338  \begin{enumerate}
 339
 340   \item Since \texttt{ln} is an unbound identifier, which CIC constants does it
 341    represent? Many different theorems in the library may share its (rather
 342    short) name \dots
 343
 344   \item Which kind of number (\IN, \IR, \dots) the \texttt{1} literal stand for?
 345    Which encoding is used in CIC to represent it? E.g., assuming $1\in\IN$, is
 346    it an unary or a binary encoding?
 347
 348   \item Which kind of equality the ``='' node represents? Is it Leibniz's
 349    polymorhpic equality? Is it a decidable equality over \IN, \IR, \dots?
 350
 351  \end{enumerate}
 352
 353 \end{example}
 354
 355 In \MATITA, three \emph{sources of ambiguity} are admitted for content level
 356 terms: unbound identifiers, literal numbers, and operators. Each instance of
 357 ambiguity sources (ambiguous entity) occuring in a content level term is
 358 associated to a \emph{disambiguation domain}. Intuitively a disambiguation
 359 domain is a set of CIC terms which may be replaced for an ambiguous entity
 360 during disambiguation. Each item of the domain is said to be an
 361 \emph{interpretation} for the ambiguous entity.
 362
 363 \emph{Unbound identifiers} (question 1) are ambiguous entities since the
 364 namespace of CIC objects is not flat and the same identifier may denote many
 365 ofthem. For example the short name \texttt{plus\_assoc} in the \HELM{} library
 366 is shared by three different theorems stating the associative property of
 367 different additions.  This kind of ambiguity is avoidable if the user is willing
 368 to use long names (in form of URIs in the \texttt{cic://} scheme) in the
 369 concrete syntax, with the obvious drawbacks of obtaining long and unreadable
 370 terms.
 371
 372 Given an unbound identifier, the corresponding disambiguation domain is computed
 373 querying the library for all constants, inductive types, and inductive type
 374 constructors having it as their short name (see the \LOCATE{} query in
 375 Sect.~\ref{sec:metadata}).
 376
 377 \emph{Literal numbers} (question 2) are ambiguous entities as well since
 378 different kinds of numbers can be encoded in CIC (\IN, \IR, \IZ, \dots) using
 379 different encodings. Considering the restricted example of natural numbers we
 380 can for instance encode them in CIC using inductive datatypes with a number of
 381 constructor equal to the encoding base plus 1, obtaining one encoding for each
 382 base.
 383
 384 For each possible way of mapping a literal number to a CIC term, \MATITA{} is
 385 aware of a \emph{number intepretation function} which, when applied to the
 386 natural number denoted by the literal\footnote{at the moment only literal
 387 natural number are supported in the concrete syntax} returns a corresponding CIC
 388 term. The disambiguation domain for a given literal number is built applying to
 389 the literal all available number interpretation functions in turn.
 390
 391 Number interpretation functions can be defined in OCaml or directly using
 392 \TODO{notazione per i numeri}.
 393
 394 \emph{Operators} (question 3) are intuitively head of applications, as such they
 395 are always applied to a non empty sequence of arguments. Their ambiguity is a
 396 need since it is often the case that some notation is used in an overloaded
 397 fashion to hide the use of different CIC constants which encodes similar
 398 concepts. For example, in the standard library of \MATITA{} the infix \texttt{+}
 399 notation is available building a binary \texttt{Op(+)} node, whose
 400 disambiguation domain may refer to different constants like the addition over
 401 natural numbers \URI{cic:/matita/nat/plus/plus.con} or that over real numbers of
 402 the \COQ{} standard library \URI{cic:/Coq/Reals/Rdefinitions/Rplus.con}.
 403
 404 For each possible way of mapping an operator application to a CIC term,
 405 \MATITA{} knows an \emph{operator interpretation function} which, when applied
 406 to an operator and its arguments, returns a CIC term. The disambiguation domain
 407 for a given operator is built applying to the operator and its arguments all
 408 available operator interpretation functions in turn.
 409
 410 Operator interpretation functions could be added using the
 411 \texttt{interpretation} statement. For example, among the first line of the
 412 script \FILE{matita/library/logic/equality.ma} from the \MATITA{} standard
 413 library we read:
 414
 415 \begin{grafite}
 416 interpretation "leibnitz's equality"
 417  'eq x y =
 418    (cic:/matita/logic/equality/eq.ind#xpointer(1/1) _ x y).
 419 \end{grafite}
 420
 421 Evaluating it in \MATITA{} will add an operator interpretation function for the
 422 binary operator \texttt{eq} which expands to the CIC term on the right hand side
 423 of the statement. That CIC term can be written using only built-in concrete
 424 syntax, can contain no ambiguity source; still, it can refer to operator
 425 arguments bound on the left hand side and can contain implicit terms (denoted
 426 with \texttt{\_}) which will be expanded to fresh metavariables. The latter
 427 feature is used in the example above for the first argument of Leibniz's
 428 polymorhpic equality.
 429
 430 \subsubsection{Disambiguation algorithm}
 431
 432 \NOTE{assumo\\
 433       che si sia\\
 434       gia' parlato\\
 435       di refine}
 436
 437
 438 A \emph{disambiguation algorithm} takes as input a content level term and return
 439 a fully determined CIC term. The key observation on which a disambiguation
 440 algorithm is based is that given a content level term with more than one sources
 441 of ambiguity, not all possible combination of interpretation lead to a typable
 442 CIC term. In the term of Ex.~\ref{ex:disambiguation} for instance the
 443 interpretation of \texttt{ln} as a function from \IR to \IR and the
 444 interpretation of \texttt{1} as the Peano number $1$ can't coexists. The notion
 445 of ``can't coexists'' in the disambiguation of \MATITA{} is inherited from the
 446 refiner described in Sect.~\ref{sec:metavariables}: as long as
 447 $\mathit{refine}(c)\neq\bot$, the combination of interpretation which led to $c$
 448 can coexists.
 449
 450 The \emph{naive disambiguation algorithm} takes as input a content level term
 451 $t$ and proceeds as follows:
 452
 453 \begin{enumerate}
 454
 455  \item Create disambiguation domains $\{D_i | i\in\mathit{Dom}(t)\}$, where
 456   $\mathit{Dom}(t)$ is the set of ambiguity sources of $t$. Each $D_i$ is a set
 457   of CIC terms and can be built as described above.
 458
 459  \item Let $\Phi = \{\phi_i | {i\in\mathit{Dom}(t)},\phi_i\in D_i\}$ be an
 460   interpretation for $t$. Given $t$ and an interpretation $\Phi$, a CIC term is
 461   fully determined. Iterate over all possible interpretations of $t$ and refine
 462   the corresponding CIC terms, keep only interpretations which lead to CIC terms
 463   $c$ s.t. $\mathit{refine}(c)\neq\bot$ (i.e. interpretations that determine
 464   typable terms).
 465
 466  \item Let $n$ be the number of interpretations who survived step 2. If $n=0$
 467   signal a type error. If $n=1$ we have found exactly one CIC term corresponding
 468   to $t$, returns it as output of the disambiguation phase. If $n>1$ we have
 469   found many different CIC terms which can correspond to the content level term,
 470   let the user choose one of the $n$ interpretations and returns the
 471   corresponding term.
 472
 473 \end{enumerate}
 474
 475 The above algorithm is highly inefficient since the number of possible
 476 interpretations $\Phi$ grows exponentially with the number of ambiguity sources.
 477 The actual algorithm used in \MATITA{} is far more efficient being, in the
 478 average case, linear in the number of ambiguity sources.
 479
 480 \TODO{FINQUI}
 481
 482 The efficient algorithm can be applied if the logic can be extended with
 483 metavariables and a refiner can be implemented. This is the case for CIC and
 484 several other logics.
 485 \emph{Metavariables}~\cite{munoz} are typed, non linear placeholders that can
 486 occur in terms; $?_i$ usually denotes the $i$-th metavariable, while $?$ denotes
 487 a freshly created metavariable. A \emph{refiner}~\cite{McBride} is a
 488 function whose input is a term with placeholders and whose output is either a
 489 new term obtained instantiating some placeholder or $\epsilon$, meaning that no
 490 well typed instantiation could be found for the placeholders occurring in
 491 the term (type error).
 492
 493 The efficient algorithm starts with an interpretation $\Phi_0 = \{\phi_i |
 494 \phi_i = ?, i\in\mathit{Dom}(t)\}$,
 495 which associates a fresh metavariable to each
 496 source of ambiguity. Then it iterates refining the current CIC term (i.e. the
 497 term obtained interpreting $t$ with $\Phi_i$). If the refinement succeeds the
 498 next interpretation $\Phi_{i+1}$ will be created \emph{making a choice}, that is
 499 replacing a placeholder with one of the possible choice from the corresponding
 500 disambiguation domain. The placeholder to be replaced is chosen following a
 501 preorder visit of the ambiguous term. If the refinement fails the current set of
 502 choices cannot lead to a well-typed term and backtracking is attempted.
 503 Once an unambiguous correct interpretation is found (i.e. $\Phi_i$ does no
 504 longer contain any placeholder), backtracking is attempted
 505 anyway to find the other correct interpretations.
 506
 507 The intuition which explain why this algorithm is more efficient is that as soon
 508 as a term containing placeholders is not typable, no further instantiation of
 509 its placeholders could lead to a typable term. For example, during the
 510 disambiguation of user input \texttt{\TEXMACRO{forall} x. x*0 = 0}, an
 511 interpretation $\Phi_i$ is encountered which associates $?$ to the instance
 512 of \texttt{0} on the right, the real number $0$ to the instance of \texttt{0} on
 513 the left, and the multiplication over natural numbers (\texttt{mult} for short)
 514 to \texttt{*}. The refiner will fail, since \texttt{mult} require a natural
 515 argument, and no further instantiation of the placeholder will be tried.
 516
 517 If, at the end of the disambiguation, more than one possible interpretations are
 518 possible, the user will be asked to choose the intended one (see
 519 Fig.~\ref{fig:disambiguation}).
 520
 521 \begin{figure}[htb]
 522 %   \centerline{\includegraphics[width=0.9\textwidth]{disambiguation-1}}
 523   \caption{\label{fig:disambiguation} Disambiguation: interpretation choice}
 524 \end{figure}
 525
 526 Details of the disambiguation algorithm of \WHELP{} can
 527 be found in~\cite{disambiguation}, where an equivalent algorithm
 528 that avoids backtracking is also presented.
 529
 530 \subsection{notazione}
 531 \label{sec:notation}
 532 \ASSIGNEDTO{zack}
 533
 534 \subsection{libreria tutta visibile}
 535 \ASSIGNEDTO{csc}
 536
 537 \subsection{ricerca e indicizzazione}
 538 \label{sec:metadata}
 539 \ASSIGNEDTO{andrea}
 540
 541 \subsection{auto}
 542 \ASSIGNEDTO{andrea}
 543
 544 \subsection{xml / gestione della libreria}
 545 \ASSIGNEDTO{gares}
 546
 547 \subsection{named variable}
 548 \ASSIGNEDTO{csc}
 549
 550 \subsection{assenza di proof tree / resa in linguaggio naturale}
 551 \ASSIGNEDTO{andrea}
 552
 553 \subsection{selezione semantica, cut paste, hyperlink}
 554 \ASSIGNEDTO{zack}
 555
 556 \subsection{sostituzioni esplicite vs moduli}
 557 \ASSIGNEDTO{csc}
 558
 559 \section{Drawbacks, missing, \dots}
 560
 561 \subsection{moduli}
 562 \ASSIGNEDTO{}
 563
 564 \subsection{ltac}
 565 \ASSIGNEDTO{}
 566
 567 \subsection{estrazione}
 568 \ASSIGNEDTO{}
 569
 570 \subsection{localizzazione errori}
 571 \ASSIGNEDTO{}
 572
 573 \textbf{Acknowledgements}
 574 We would like to thank all the students that during the past
 575 five years collaborated in the \HELM{} project and contributed to
 576 the development of Matita, and in particular
 577 A.Griggio, F.Guidi, P. Di Lena, L.Padovani, I.Schena, M.Selmi,
 578 V.Tamburrelli.
 579
 580 \bibliography{matita}
 581
 582 \end{document}
 583