followed kluwer guidelines

[helm.git] / helm / papers / matita / matita.tex
diff --git a/helm/papers/matita/matita.tex b/helm/papers/matita/matita.tex

index 4a74110b4df53596967c8d3dba418adbf3c1ff49..b7b20dc2ecf468797982ad9673e0900e14998373 100644 (file)
--- a/helm/papers/matita/matita.tex
+++ b/helm/papers/matita/matita.tex
@@ -1,12 +1,14 @@
-\documentclass[a4paper]{llncs}
-\pagestyle{headings}
+\documentclass{kluwer}
  \usepackage{color}
  \usepackage{graphicx}
-\usepackage{amssymb,amsmath}
+% \usepackage{amssymb,amsmath}
  \usepackage{hyperref}
-\usepackage{picins}
+% \usepackage{picins}
+\usepackage{color}
  \usepackage{fancyvrb}
+\usepackage[show]{ed}
  
+\definecolor{gray}{gray}{0.85}
  %\newcommand{\logo}[3]{
  %\parpic(0cm,0cm)(#2,#3)[l]{\includegraphics[width=#1]{whelp-bw}}
  %}
@@ -16,10 +18,10 @@
  \newcommand{\ELIM}{\textsc{Elim}}
  \newcommand{\HELM}{Helm}
  \newcommand{\HINT}{\textsc{Hint}}
-\newcommand{\IN}{\ensuremath{\mathbb{N}}}
+\newcommand{\IN}{\ensuremath{\dN}}
  \newcommand{\INSTANCE}{\textsc{Instance}}
-\newcommand{\IR}{\ensuremath{\mathbb{R}}}
-\newcommand{\IZ}{\ensuremath{\mathbb{Z}}}
+\newcommand{\IR}{\ensuremath{\dR}}
+\newcommand{\IZ}{\ensuremath{\dZ}}
  \newcommand{\LIBXSLT}{LibXSLT}
  \newcommand{\LOCATE}{\textsc{Locate}}
  \newcommand{\MATCH}{\textsc{Match}}
@@ -48,24 +50,60 @@
     \fcolorbox{black}{gray}{\BUseVerbatim[boxwidth=0.9\linewidth]{boxtmp}}
    \end{center}}
  
+\newcounter{example}
+\newenvironment{example}{\stepcounter{example}\emph{Example} \arabic{example}.}
+ {}
  \newcommand{\ASSIGNEDTO}[1]{\textbf{Assigned to:} #1}
-\newcommand{\NOTE}[1]{\marginpar{\scriptsize #1}}
+\newcommand{\FILE}[1]{\texttt{#1}}
+% \newcommand{\NOTE}[1]{\ifodd \arabic{page} \else \hspace{-2cm}\fi\ednote{#1}}
+\newcommand{\NOTE}[1]{\ednote{x~\hspace{-10cm}y#1}{foo}}
  \newcommand{\TODO}[1]{\textbf{TODO: #1}}
  
-\title{The Matita proof assistant}
-\author{Andrea Asperti, Claudio Sacerdoti Coen, Enrico Tassi
- and Stefano Zacchiroli}
-\institute{Department of Computer Science, University of Bologna\\
- Mura Anteo Zamboni, 7 --- 40127 Bologna, ITALY\\
- \email{$\{$asperti,sacerdot,tassi,zacchiro$\}$@cs.unibo.it}}
+\newsavebox{\tmpxyz}
+\newcommand{\sequent}[2]{
+  \savebox{\tmpxyz}[0.9\linewidth]{
+    \begin{minipage}{0.9\linewidth}
+      \ensuremath{#1} \\
+      \rule{3cm}{0.03cm}\\
+      \ensuremath{#2}
+    \end{minipage}}\setlength{\fboxsep}{3mm}%
+  \begin{center}
+   \fcolorbox{black}{gray}{\usebox{\tmpxyz}}
+  \end{center}}
+
  \bibliographystyle{plain}
  
  \begin{document}
-\maketitle
+
+\begin{opening}
+
+ \title{The \MATITA{} Proof Assistant}
+
+\author{Andrea \surname{Asperti} \email{asperti@cs.unibo.it}}
+\author{Claudio \surname{Sacerdoti Coen} \email{sacerdot@cs.unibo.it}}
+\author{Enrico \surname{Tassi} \email{tassi@cs.unibo.it}}
+\author{Stefano \surname{Zacchiroli} \email{zacchiro@cs.unibo.it}}
+\institute{Department of Computer Science, University of Bologna\\
+ Mura Anteo Zamboni, 7 --- 40127 Bologna, ITALY}
+
+\runningtitle{The Matita proof assistant}
+\runningauthor{Asperti, Sacerdoti Coen, Tassi, Zacchiroli}
+
+% \date{data}
+
+\begin{motto}
+ Be primitive.
+\end{motto}
  
  \begin{abstract}
+ abstract qui
  \end{abstract}
  
+\keywords{Proof Assistant, Mathematical Knowledge Management, XML, Authoring,
+Digital Libraries}
+
+\end{opening}
+
  \section{Introduction}
  \label{sec:intro}
  {\em Matita} is the proof assistant under development by the \HELM{} team
@@ -112,7 +150,7 @@ none of the advantages of the previous representations).
  Partially related to this problem, there is the
  issue of the {\em granularity} of the library: scripts usually comprise
  small developments with many definitions and theorems, while 
-proof objects correspond to individual mathemtical items. 
+proof objects correspond to individual mathematical items. 
  
  In our case, the choice of the content encoding was eventually dictated
  by the methodological assumption of offering the information in a
@@ -170,9 +208,9 @@ existential variables, used by the disambiguating parser;
  language;
  \item an innovative rendering widget, supporting high-quality bidimensional
  rendering, and semantic selection, i.e. the possibility to select semantically
-meaningfull rendering expressions, and to past the respective content into
+meaningful rendering expressions, and to past the respective content into
  a different text area.
-\NOTE{il widget\\ non ha sel\\ semantica}
+\NOTE{il widget non ha sel semantica}
  \end{itemize}
  Starting from all this, the further step of developing our own 
  proof assistant was too
@@ -190,11 +228,11 @@ language (\OCAML{}), and the same (script based) authoring philosophy.
  However, as we shall see, the analogy essentially stops here. 
  
  In a sense; we like to think of \MATITA{} as the way \COQ{} would 
-look like if entirely rerwritten from scratch: just to give an
+look like if entirely rewritten from scratch: just to give an
  idea, although \MATITA{} currently supports almost all functionalities of
  \COQ{}, it links 60'000 lins of \OCAML{} code, against ... of \COQ{} (and
-we are convinced that, starting from scratch againg, we could furtherly
-reduce our code in sensible way).\NOTE{righe\\\COQ{}}
+we are convinced that, starting from scratch again, we could furtherly
+reduce our code in sensible way).\NOTE{righe \COQ{}}
  
  \begin{itemize}
   \item scelta del sistema fondazionale
@@ -206,70 +244,114 @@ reduce our code in sensible way).\NOTE{righe\\\COQ{}}
    \end{itemize}
  \end{itemize}
  
-\section{Features}
+\section{\HELM{} library(??)}
  
-\subsection{mathml}
-\ASSIGNEDTO{zack}
+\subsection{libreria tutta visibile}
+\ASSIGNEDTO{csc}
+\NOTE{assumo che si sia gia' parlato di approccio content-centrico}
+Our commitment to the content-centric view of the architecture of the system
+has important consequences on the user's experience and on the functionalities
+of several components of \MATITA. In the content-centric view the library
+of mathematical knowledge is an already existent and completely autonomous
+entity that we are allowed to exploit and augment using \MATITA. Thus, in
+principle, when the user starts to prove a new theorem she has complete
+visibility of the library and she can refer to every definition and lemma,
+also using the mathematical notation already developed. In a similar way,
+every form of automation of the system must be able to analyze and possibly
+exploit every notion in the library.
+
+The benefits of this approach highly justify the non neglectable price to pay
+in the development of several components. We analyse now a few of the causes
+of this additional complexity.
+
+\subsubsection{Ambiguity}
+A rich mathematical library includes equivalent definitions and representations
+of the same notion. Moreover, mathematical notation inside a rich library is
+surely highly overloaded. As a consequence every mathematical expression the
+user provides is highly ambiguous since all the definitions,
+representations and special notations are available at once to the user.
+
+The usual solution to the problem, as adopted for instance in Coq, is to
+restrict the user's scope to just one interpretation for each definition,
+representation or notation. In this way much of the ambiguity is removed,
+burdening the user that must someway declare what is in scope and that must
+use special syntax when she needs to refer to something not in scope.
+
+Even with this approach ambiguity cannot be completely removed since implicit
+coercions can be arbitrarily inserted by the system to ``change the type''
+of subterms that do not have the expected type. Usually implicit coercions
+are used to overcome the absence of subtyping that should mimic the subset
+relation found in set theory. For instance, the expression
+$\forall n \in nat. 2 * n * \pi \equiv_\pi 0$ is correct in set theory since
+the set of natural numbers is a subset of that of real numbers; the
+corresponding expression $\forall n:nat. 2*n*\pi \equiv_\pi 0$ is not well typed
+and requires the automatic insertion of the coercion $real_of_nat: nat \to R$
+either around both 2 and $n$ (to make both products be on real numbers) or
+around the product $2*n$. The usual approach consists in either rejecting the
+ambiguous term or arbitrarily choosing one of the interpretations. For instance,
+Coq rejects the declaration of coercions that have alternatives
+(i.e. already declared coercions with the same domain and codomain)
+or that are obtained composing other coercions in order to
+avoid making several terms highly ambiguous by choosing to insert any one of the
+alternative coercions. Coq also arbitrarily chooses how to insert coercions in
+terms to make them well typed when there is more than one possibility (as in
+the previous example).
+
+The approach we are following is radically different. It consists in dealing
+with ambiguous expressions instead of avoiding them. As a last resource,
+when the system is unable to disambiguate the input, the user is interactively
+required to provide more information that is recorded to avoid asking the
+same question again in subsequent processing of the same input.
+More details on our approach can be found in \ref{sec:disambiguation}.
+
+\subsubsection{Consistency}
+A large mathematical library is likely to be logically inconsistent.
+It may contain incompatible axioms or alternative conjectures and it may
+even use definitions in incompatible ways. To clarify this last point,
+consider two identical definitions of a set of elements of a given type
+(or of a category of objects of a given type). Let us call the two definitions
+$A-Set$ and $B-Set$ (or $A-Category$ and $B-Category$).
+It is perfectly legitimate to either form the $A-Set$ of every $B-Set$
+or the $B-Set$ of every $A-Set$ (the same for categories). This just corresponds
+to assuming that a $B-Set$ (respectively an $A-Set$) is a small set, whereas
+an $A-Set$ (respectively a $B-Set$) is a big set (possibly of small sets).
+However, if one part of the library assumes $A-Set$s to be the small ones
+and another part of the library assumes $B-Set$s to be the small ones, the
+library as a whole will be logically inconsistent.
+
+Logical inconsistency has never been a problem in the daily work of a
+mathematician. The mathematician simply imposes himself a discipline to
+restrict himself to consistent subsets of the mathematical knowledge.
+However, in doing so he doesn't choose the subset in advance by forgetting
+the rest of his knowledge.
+
+Contrarily to a mathematician, the usual tendency in the world of assisted
+automation is that of restricting in advance the part of the library that
+will be used later on, checking its consistency by construction.
  
-\subsection{metavariabili}
+\subsection{ricerca e indicizzazione}
+\label{sec:metadata}
+\ASSIGNEDTO{andrea}
+
+\subsection{auto}
+\ASSIGNEDTO{andrea}
+
+\subsection{sostituzioni esplicite vs moduli}
  \ASSIGNEDTO{csc}
  
-\subsection{pattern}
+\subsection{xml / gestione della libreria}
  \ASSIGNEDTO{gares}
  
-\subsection{tatticali}
-\ASSIGNEDTO{gares}
+
+\section{User Interface (da cambiare)}
+
+\subsection{assenza di proof tree / resa in linguaggio naturale}
+\ASSIGNEDTO{andrea}
  
  \subsection{Disambiguation}
  \label{sec:disambiguation}
  \ASSIGNEDTO{zack}
  
-\begin{table}
- \caption{\label{tab:termsyn} Concrete syntax of CIC terms: built-in notation\strut}
-\hrule
-\[
-\begin{array}{@{}rcll@{}}
-  \NT{term} & ::= & & \mbox{\bf terms} \\
-    &     & x & \mbox{(identifier)} \\
-    &  |  & n & \mbox{(number)} \\
-    &  |  & s & \mbox{(symbol)} \\
-    &  |  & \mathrm{URI} & \mbox{(URI)} \\
-    &  |  & \verb+?+ & \mbox{(implicit)} \\
-    &  |  & \verb+?+n~[\verb+[+~\{\NT{subst}\}~\verb+]+] & \mbox{(meta)} \\
-    &  |  & \verb+let+~\NT{ptname}~\verb+\def+~\NT{term}~\verb+in+~\NT{term} \\
-    &  |  & \verb+let+~\NT{kind}~\NT{defs}~\verb+in+~\NT{term} \\
-    &  |  & \NT{binder}~\{\NT{ptnames}\}^{+}~\verb+.+~\NT{term} \\
-    &  |  & \NT{term}~\NT{term} & \mbox{(application)} \\
-    &  |  & \verb+Prop+ \mid \verb+Set+ \mid \verb+Type+ \mid \verb+CProp+ & \mbox{(sort)} \\
-    &  |  & \verb+match+~\NT{term}~ & \mbox{(pattern matching)} \\
-    &     & ~ ~ [\verb+[+~\verb+in+~x~\verb+]+]
-             ~ [\verb+[+~\verb+return+~\NT{term}~\verb+]+] \\
-    &     & ~ ~ \verb+with [+~[\NT{rule}~\{\verb+|+~\NT{rule}\}]~\verb+]+ & \\
-    &  |  & \verb+(+~\NT{term}~\verb+:+~\NT{term}~\verb+)+ & \mbox{(cast)} \\
-    &  |  & \verb+(+~\NT{term}~\verb+)+ \\
-  \NT{defs}  & ::= & & \mbox{\bf mutual definitions} \\
-    &     & \NT{fun}~\{\verb+and+~\NT{fun}\} \\
-  \NT{fun} & ::= & & \mbox{\bf functions} \\
-    &     & \NT{arg}~\{\NT{ptnames}\}^{+}~[\verb+on+~x]~\verb+\def+~\NT{term} \\
-  \NT{binder} & ::= & & \mbox{\bf binders} \\
-    &     & \verb+\forall+ \mid \verb+\lambda+ \\
-  \NT{arg} & ::= & & \mbox{\bf single argument} \\
-    &     & \verb+_+ \mid x \\
-  \NT{ptname} & ::= & & \mbox{\bf possibly typed name} \\
-    &     & \NT{arg} \\
-    &  |  & \verb+(+~\NT{arg}~\verb+:+~\NT{term}~\verb+)+ \\
-  \NT{ptnames} & ::= & & \mbox{\bf bound variables} \\
-    &     & \NT{arg} \\
-    &  |  & \verb+(+~\NT{arg}~\{\verb+,+~\NT{arg}\}~[\verb+:+~\NT{term}]~\verb+)+ \\
-  \NT{kind} & ::= & & \mbox{\bf induction kind} \\
-    &     & \verb+rec+ \mid \verb+corec+ \\
-  \NT{rule} & ::= & & \mbox{\bf rules} \\
-    &     & x~\{\NT{ptname}\}~\verb+\Rightarrow+~\NT{term}
-\end{array}
-\]
-\hrule
-\end{table}
-
  \subsubsection{Term input}
  
  The primary form of user interaction employed by \MATITA{} is textual script
@@ -323,6 +405,7 @@ because some nodes of the content encoding admit more that one CIC encoding,
  invalidating requirement (2).
  
  \begin{example}
+ \label{ex:disambiguation}
  
   Consider the term at the concrete syntax level \texttt{\TEXMACRO{forall} x. x +
   ln 1 = x} of Fig.~\ref{fig:inputphase}(a), it can be the type of a lemma the
@@ -388,8 +471,8 @@ Number interpretation functions can be defined in OCaml or directly using
  \TODO{notazione per i numeri}.
  
  \emph{Operators} (question 3) are intuitively head of applications, as such they
-are always applied to a non empty sequence of arguments. Their ambiguity is a
-need since it is often the case that some notation is used in an overloaded
+are always applied to a (possiblt empty) sequence of arguments. Their ambiguity
+is a need since it is often the case that some notation is used in an overloaded
  fashion to hide the use of different CIC constants which encodes similar
  concepts. For example, in the standard library of \MATITA{} the infix \texttt{+}
  notation is available building a binary \texttt{Op(+)} node, whose
@@ -397,57 +480,85 @@ disambiguation domain may refer to different constants like the addition over
  natural numbers \URI{cic:/matita/nat/plus/plus.con} or that over real numbers of
  the \COQ{} standard library \URI{cic:/Coq/Reals/Rdefinitions/Rplus.con}.
  
-For each possible way of mapping a symbol application to a CIC term, \MATITA{}
-knows a \emph{symbol interpretation function} which, when applied to a symbol
-and its arguments, returns a CIC term. The disambiguation domain for a given
-operator is built applying to the symbol and its arguments all available symbol
-interpretation functions in turn.
+For each possible way of mapping an operator application to a CIC term,
+\MATITA{} knows an \emph{operator interpretation function} which, when applied
+to an operator and its arguments, returns a CIC term. The disambiguation domain
+for a given operator is built applying to the operator and its arguments all
+available operator interpretation functions in turn.
+
+Operator interpretation functions could be added using the
+\texttt{interpretation} statement. For example, among the first line of the
+script \FILE{matita/library/logic/equality.ma} from the \MATITA{} standard
+library we read:
  
  \begin{grafite}
- foo
- bar
- baz
+interpretation "leibnitz's equality"
+ 'eq x y =
+   (cic:/matita/logic/equality/eq.ind#xpointer(1/1) _ x y).
  \end{grafite}
  
-\TODO{FINQUI, il resto \`e copy and paste dal Whelp paper \dots}
-
-Note that given a content level term with more than one sources of ambiguity,
-not all possible disambiguation choices are valid: for example, given the input
-\texttt{1+1} we must choose an interpretation of \texttt{+} which is typable in
-CIC according to the chosen interpretation for \texttt{1}; choosing as
-\texttt{+} the addition over natural numbers and as \texttt{1} the real number
-$1$ will lead to a type error.
-
-A \emph{disambiguation algorithm} takes as input an ambiguous term and return a
-fully determined CIC term. The \emph{naive disambiguation algorithm} takes as
-input an ambiguous term $t$ and proceeds as follows:
+Evaluating it in \MATITA{} will add an operator interpretation function for the
+binary operator \texttt{eq} which expands to the CIC term on the right hand side
+of the statement. That CIC term can be written using only built-in concrete
+syntax, can contain no ambiguity source; still, it can refer to operator
+arguments bound on the left hand side and can contain implicit terms (denoted
+with \texttt{\_}) which will be expanded to fresh metavariables. The latter
+feature is used in the example above for the first argument of Leibniz's
+polymorhpic equality.
+
+\subsubsection{Disambiguation algorithm}
+
+\NOTE{assumo che si sia gia' parlato di refine}
+
+A \emph{disambiguation algorithm} takes as input a content level term and return
+a fully determined CIC term. The key observation on which a disambiguation
+algorithm is based is that given a content level term with more than one sources
+of ambiguity, not all possible combination of interpretation lead to a typable
+CIC term. In the term of Ex.~\ref{ex:disambiguation} for instance the
+interpretation of \texttt{ln} as a function from \IR to \IR and the
+interpretation of \texttt{1} as the Peano number $1$ can't coexists. The notion
+of ``can't coexists'' in the disambiguation of \MATITA{} is inherited from the
+refiner described in Sect.~\ref{sec:metavariables}: as long as
+$\mathit{refine}(c)\neq\epsilon$, the combination of interpretation which led to
+$c$
+can coexists.
+
+The \emph{naive disambiguation algorithm} takes as input a content level term
+$t$ and proceeds as follows:
  
  \begin{enumerate}
  
   \item Create disambiguation domains $\{D_i | i\in\mathit{Dom}(t)\}$, where
    $\mathit{Dom}(t)$ is the set of ambiguity sources of $t$. Each $D_i$ is a set
-  of CIC terms.
+  of CIC terms and can be built as described above.
  
- \item Let $\Phi = \{\phi_i | {i\in\mathit{Dom}(t)},\phi_i\in D_i\}$
-%  such that $\forall i\in\mathit{Dom}(t),\exists\phi_j\in\Phi,i=j$
-  be an interpretation for $t$. Given $t$ and an interpretation $\Phi$, a CIC
-  term is fully determined. Iterate over all possible interpretations of $t$ and
-  type-check them, keep only typable interpretations (i.e. interpretations that
-  determine typable terms).
+ \item Let $\Phi = \{\phi_i | {i\in\mathit{Dom}(t)},\phi_i\in D_i\}$ be an
+  interpretation for $t$. Given $t$ and an interpretation $\Phi$, a CIC term is
+  fully determined. Iterate over all possible interpretations of $t$ and refine
+  the corresponding CIC terms, keep only interpretations which lead to CIC terms
+  $c$ s.t. $\mathit{refine}(c)\neq\epsilon$ (i.e. interpretations that determine
+  typable terms).
  
   \item Let $n$ be the number of interpretations who survived step 2. If $n=0$
-  signal a type error. If $n=1$ we have found exactly one CIC term
-  corresponding to $t$, returns it as output of the disambiguation phase.
-  If $n>1$ let the user choose one of the $n$ interpretations and returns the
+  signal a type error. If $n=1$ we have found exactly one CIC term corresponding
+  to $t$, returns it as output of the disambiguation phase. If $n>1$ we have
+  found many different CIC terms which can correspond to the content level term,
+  let the user choose one of the $n$ interpretations and returns the
    corresponding term.
  
  \end{enumerate}
  
  The above algorithm is highly inefficient since the number of possible
  interpretations $\Phi$ grows exponentially with the number of ambiguity sources.
-The actual algorithm used in \WHELP{} is far more efficient being, in the
+The actual algorithm used in \MATITA{} is far more efficient being, in the
  average case, linear in the number of ambiguity sources.
  
+The efficient algorithm --- thoroughly described along with an analysis of its
+complexity in~\cite{disambiguation} --- exploit the refiner and the metavariable
+extension (Sect.~\ref{sec:metavariables}) of the calculus used in \MATITA.
+
+\TODO{FINQUI}
+
  The efficient algorithm can be applied if the logic can be extended with
  metavariables and a refiner can be implemented. This is the case for CIC and
  several other logics.
@@ -496,34 +607,249 @@ Details of the disambiguation algorithm of \WHELP{} can
  be found in~\cite{disambiguation}, where an equivalent algorithm
  that avoids backtracking is also presented.
  
+
  \subsection{notazione}
  \label{sec:notation}
  \ASSIGNEDTO{zack}
  
-\subsection{libreria tutta visibile}
-\ASSIGNEDTO{csc}
+\subsection{mathml}
+\ASSIGNEDTO{zack}
  
-\subsection{ricerca e indicizzazione}
-\label{sec:metadata}
-\ASSIGNEDTO{andrea}
+\subsection{selezione semantica, cut paste, hyperlink}
+\ASSIGNEDTO{zack}
  
-\subsection{auto}
-\ASSIGNEDTO{andrea}
+\subsection{pattern}
+\ASSIGNEDTO{gares}\\
+Patterns are the textual counterpart of the MathML widget graphical
+selection.
+
+Matita benefits of a graphical interface and a powerful MathML rendering
+widget that allows the user to select pieces of the sequent he is working
+on. While this is an extremely intuitive way for the user to
+restrict the application of tactics, for example, to some subterms of the
+conclusion or some hypothesis, the way this action is recorded to the text
+script is not obvious.\\
+In \MATITA{} this issue is addressed by patterns.
+
+\subsubsection{Pattern syntax}
+A pattern is composed of two terms: a $\NT{sequent\_path}$ and a
+$\NT{wanted}$.
+The former mocks-up a sequent, discharging unwanted subterms with $?$ and
+selecting the interesting parts with the placeholder $\%$. 
+The latter is a term that lives in the context of the placeholders.
+
+The concrete syntax is reported in table \ref{tab:pathsyn}
+\NOTE{uso nomi diversi dalla grammatica ma che hanno + senso}
+\begin{table}
+ \caption{\label{tab:pathsyn} Concrete syntax of \MATITA{} patterns.\strut}
+\hrule
+\[
+\begin{array}{@{}rcll@{}}
+  \NT{pattern} & 
+    ::= & [~\verb+in match+~\NT{wanted}~]~[~\verb+in+~\NT{sequent\_path}~] & \\
+  \NT{sequent\_path} & 
+    ::= & \{~\NT{ident}~[~\verb+:+~\NT{multipath}~]~\}~
+      [~\verb+\vdash+~\NT{multipath}~] & \\
+  \NT{wanted} & ::= & \NT{term} & \\
+  \NT{multipath} & ::= & \NT{term\_with\_placeholders} & \\
+\end{array}
+\]
+\hrule
+\end{table}
  
-\subsection{xml / gestione della libreria}
+\subsubsection{How patterns work}
+Patterns mimic the user's selection in two steps. The first one
+selects roots (subterms) of the sequent, using the
+$\NT{sequent\_path}$,  while the second 
+one searches the $\NT{wanted}$ term starting from these roots. Both are
+optional steps, and by convention the empty pattern selects the whole
+conclusion.
+
+\begin{description}
+\item[Phase 1]
+  concerns only the $[~\verb+in+~\NT{sequent\_path}~]$
+  part of the syntax. $\NT{ident}$ is an hypothesis name and
+  selects the assumption where the following optional $\NT{multipath}$
+  will operate. \verb+\vdash+ can be considered the name for the goal.
+  If the whole pattern is omitted, the whole goal will be selected.
+  If one or more hypotheses names are given the selection is restricted to 
+  these assumptions. If a $\NT{multipath}$ is omitted the whole
+  assumption is selected. Remember that the user can be mostly
+  unaware of this syntax, since the system is able to write down a 
+  $\NT{sequent\_path}$ starting from a visual selection.
+  \NOTE{Questo ancora non va in matita}
+
+  A $\NT{multipath}$ is a CiC term in which a special constant $\%$
+  is allowed.
+  The roots of discharged subterms are marked with $?$, while $\%$
+  is used to select roots. The default $\NT{multipath}$, the one that
+  selects the whole term, is simply $\%$.
+  Valid $\NT{multipath}$ are, for example, $(?~\%~?)$ or $\%~\verb+\to+~(\%~?)$
+  that respectively select the first argument of an application or
+  the source of an arrow and the head of the application that is
+  found in the arrow target.
+
+  The first phase selects not only terms (roots of subterms) but also 
+  their context that will be eventually used in the second phase.
+
+\item[Phase 2] 
+  plays a role only if the $[~\verb+in match+~\NT{wanted}~]$
+  part is specified. From the first phase we have some terms, that we
+  will see as subterm roots, and their context. For each of these
+  contexts the $\NT{wanted}$ term is disambiguated in it and the
+  corresponding root is searched for a subterm $\alpha$-equivalent to
+  $\NT{wanted}$. The result of this search is the selection the
+  pattern represents.
+
+\end{description}
+
+\noindent
+Since the first step is equipotent to the composition of the two
+steps, the system uses it to represent each visual selection.
+The second step is only meant for the
+experienced user that writes patterns by hand, since it really
+helps in writing concise patterns as we will see in the
+following examples.
+
+\subsubsection{Examples}
+To explain how the first step works let's give an example. Consider
+you want to prove the uniqueness of the identity element $0$ for natural
+sum, and that you can relay on the previously demonstrated left
+injectivity of the sum, that is $inj\_plus\_l:\forall x,y,z.x+y=z+y \to x =z$.
+Typing
+\begin{grafite}
+theorem valid_name: \forall n,m. m + n = n \to m = O.
+  intros (n m H).
+\end{grafite}
+\noindent
+leads you to the following sequent 
+\sequent{
+n:nat\\
+m:nat\\
+H: m + n = n}{
+m=O
+}
+\noindent
+where you want to change the right part of the equivalence of the $H$
+hypothesis with $O + n$ and then use $inj\_plus\_l$ to prove $m=O$.
+\begin{grafite}
+  change in H:(? ? ? %) with (O + n).
+\end{grafite}
+\noindent
+This pattern, that is a simple instance of the $\NT{sequent\_path}$
+grammar entry, acts on $H$ that has type (without notation) $(eq~nat~(m+n)~n)$
+and discharges the head of the application and the first two arguments with a
+$?$ and selects the last argument with $\%$. The syntax may seem uncomfortable,
+but the user can simply select with the mouse the right part of the equivalence
+and left to the system the burden of writing down in the script file the
+corresponding pattern with $?$ and $\%$ in the right place (that is not
+trivial, expecially where implicit arguments are hidden by the notation, like
+the type $nat$ in this example).
+
+Changing all the occurrences of $n$ in the hypothesis $H$ with $O+n$ 
+works too and can be done, by the experienced user, writing directly
+a simpler pattern that uses the second phase.
+\begin{grafite}
+  change in match n in H with (O + n).
+\end{grafite}
+\noindent
+In this case the $\NT{sequent\_path}$ selects the whole $H$, while
+the second phase searches the wanted $n$ inside it by
+$\alpha$-equivalence. The resulting
+equivalence will be $m+(O+n)=O+n$ since the second phase found two
+occurrences of $n$ in $H$ and the tactic changed both.
+
+Just for completeness the second pattern is equivalent to the
+following one, that is less readable but uses only the first phase.
+\begin{grafite}
+  change in H:(? ? (? ? %) %) with (O + n).
+\end{grafite}
+\noindent
+
+\subsubsection{Tactics supporting patterns}
+In \MATITA{} all the tactics that can be restricted to subterm of the working
+sequent accept the pattern syntax. In particular these tactics are: simplify,
+change, fold, unfold, generalize, replace and rewrite.
+
+\NOTE{attualmente rewrite e fold non supportano phase 2. per
+supportarlo bisogna far loro trasformare il pattern phase1+phase2 
+in un pattern phase1only come faccio nell'ultimo esempio. lo si fa
+con una pattern\_of(select(pattern))}
+
+\subsubsection{Comparison with Coq}
+Coq has a two diffrent ways of restricting the application of tactis to
+subterms of the sequent, both relaying on the same special syntax to identify
+a term occurrence.
+
+The first way is to use this special syntax to specify directly to the
+tactic the occurrnces of a wanted term that should be affected, while
+the second is to prepare the sequent with another tactic called
+pattern and the apply the real tactic. Note that the choice is not
+left to the user, since some tactics needs the sequent to be prepared
+with pattern and do not accept directly this special syntax.
+
+The base idea is that to identify a subterm of the sequent we can
+write it and say that we want, for example, the third and the fifth
+occurce of it (counting from left to right). In our previous example,
+to change only the left part of the equivalence, the correct command
+is
+\begin{grafite}
+  change n at 2 in H with (O + n)
+\end{grafite} 
+\noindent
+meaning that in the hypothesis $H$ the $n$ we want to change is the
+second we encounter proceeding from left toright.
+
+The tactic pattern computes a
+$\beta$-expansion of a part of the sequent with respect to some
+occurrences of the given term. In the previous example the following
+command
+\begin{grafite}
+  pattern n at 2 in H
+\end{grafite}
+\noindent
+would have resulted in this sequent
+\begin{grafite}
+  n : nat
+  m : nat
+  H : (fun n0 : nat => m + n = n0) n
+  ============================
+   m = 0
+\end{grafite}
+\noindent
+where $H$ is $\beta$-expanded over the second $n$
+occurrence. This is a trick to make the unification algorithm ignore
+the head of the application (since the unification is essentially
+first-order) but normally operate on the arguments. 
+This works for some tactics, like rewrite and replace,
+but for example not for change and other tactics that do not relay on
+unification. 
+
+The idea behind this way of identifying subterms in not really far
+from the idea behind patterns, but really fails in extending to
+complex notation, since it relays on a mono-dimensional sequent representation.
+Real math notation places arguments upside-down (like in indexed sums or
+integrations) or even puts them inside a bidimensional matrix.  
+In these cases using the mouse to select the wanted term is probably the 
+only way to tell the system exactly what you want to do. 
+
+One of the goals of \MATITA{} is to use modern publishing techiques, and
+adopting a method for restricting tactics application domain that discourages 
+using heavy math notation, would definitively be a bad choice.
+
+\subsection{tatticali}
  \ASSIGNEDTO{gares}
  
-\subsection{named variable}
+\subsection{named variable e disambiguazione lazy}
  \ASSIGNEDTO{csc}
  
-\subsection{assenza di proof tree / resa in linguaggio naturale}
-\ASSIGNEDTO{andrea}
+\subsection{metavariabili}
+\label{sec:metavariables}
+\ASSIGNEDTO{csc}
  
-\subsection{selezione semantica, cut paste, hyperlink}
-\ASSIGNEDTO{zack}
+\begin{verbatim}
  
-\subsection{sostituzioni esplicite vs moduli}
-\ASSIGNEDTO{csc}
+\end{verbatim}
  
  \section{Drawbacks, missing, \dots}
  
@@ -539,14 +865,17 @@ that avoids backtracking is also presented.
  \subsection{localizzazione errori}
  \ASSIGNEDTO{}
  
-\textbf{Acknowledgements}
+\acknowledgements
  We would like to thank all the students that during the past
  five years collaborated in the \HELM{} project and contributed to 
  the development of Matita, and in particular
  A.Griggio, F.Guidi, P. Di Lena, L.Padovani, I.Schena, M.Selmi, 
  V.Tamburrelli.
  
+\theendnotes
+
  \bibliography{matita}
  
+
  \end{document}