From ecae84bd553e50c3182697e994ea28ae2d18d68f Mon Sep 17 00:00:00 2001
From: Enrico Tassi <enrico.tassi@inria.fr>
Date: Mon, 30 Jan 2006 14:13:41 +0000
Subject: [PATCH 1/1] CSC/Occam chain-saw/razor on tinycals and library
 generation/invalidation. minor changes on patters terminology.

---
 helm/papers/matita/matita2.tex | 478 +++++++++++++++++----------------
 1 file changed, 245 insertions(+), 233 deletions(-)

diff --git a/helm/papers/matita/matita2.tex b/helm/papers/matita/matita2.tex
index 81e0c1dee..353066a88 100644
--- a/helm/papers/matita/matita2.tex
+++ b/helm/papers/matita/matita2.tex
@@ -1064,111 +1064,87 @@ user interaction during the disambiguation.
 \subsection{Generation and invalidation}
 \label{sec:libmanagement}
 
-The aim of this section is to describe the way \MATITA{} 
-preserves the consistency and the availability of the library
-using the \WHELP{} technology, in response to the user alteration or 
-removal of mathematical objects.
-
-As already sketched in Sect.~\ref{sec:fullyintro} what we generate 
-from a script is split among two storage media, a
-classical filesystem and a relational database. The former is used to
-store the XML encoding of the objects defined in the script, the
-disambiguation aliases and the interpretation and notational convention defined,
-while the latter is used to store all the metadata needed by
-\WHELP{}.
-
-While the consistency of the data store in the two media has
-nothing to do with the nature of
-the content of the library and is thus uninteresting (but really
-tedious to implement and keep bug-free), there is a deeper
-notion of mathematical consistency we need to provide. Each object
-must reference only defined object (i.e. each proof must use only
-already proved theorems). 
-
-We will focus on how \MATITA{} ensures the interesting kind
-of consistency during the formalization of a mathematical theory, 
+%The aim of this section is to describe the way \MATITA{} 
+%preserves the consistency and the availability of the library
+%using the \WHELP{} technology, in response to the user alteration or 
+%removal of mathematical objects.
+%
+%As already sketched in Sect.~\ref{sec:fullyintro} what we generate 
+%from a script is split among two storage media, a
+%classical filesystem and a relational database. The former is used to
+%store the XML encoding of the objects defined in the script, the
+%disambiguation aliases and the interpretation and notational convention defined,
+%while the latter is used to store all the metadata needed by
+%\WHELP{}.
+%
+%While the consistency of the data store in the two media has
+%nothing to do with the nature of
+%the content of the library and is thus uninteresting (but really
+%tedious to implement and keep bug-free), there is a deeper
+%notion of mathematical consistency we need to provide. Each object
+%must reference only defined object (i.e. each proof must use only
+%already proved theorems). 
+
+In this section we will focus on how \MATITA{} ensures the library 
+consistency during the formalization of a mathematical theory, 
 giving the user the freedom of adding, removing, modifying objects
 without loosing the feeling of an always visible and browsable
 library.
 
-\subsubsection{Compilation}
+\subsubsection{Invalidation}
 
-The typechecker component guarantees that if an object is well typed
-it depends only on well typed objects available in the library,
-that is exactly what we need to be sure that the logic consistency of
-the library is preserved. We have only to find the right order of
-compilation of the scripts that compose the user development.
+Invalidation (see Sect.~\ref{sec:library}) is implemented in two phases.
+
+The first one is the calculation of all the concepts that recursively
+depend on the ones we are invalidating.  The calculation of the
+reverse dependencies can be computed using the relational database
+that stores metadata.
+This technique is the same used by the \emph{Dependency Analyzer}
+and is described in~\cite{zack-master}.
+
+The second phase is the removal of all the results of the generation,
+metadata included.
+
+\subsubsection{Regeneration}
+
+%The typechecker component guarantees that if an object is well typed
+%it depends only on well typed objects available in the library,
+%that is exactly what we need to be sure that the logic consistency of
+%the library is preserved.
+
+To regenerate an invalidated part of the library \MATITA{} re-executes
+the script files that produced the invalidated concepts.  The main 
+problem is to find a suitable order of execution of the scripts.
 
 For this purpose we provide a tool called \MATITADEP{}
-that takes in input the list of files that compose the development and
+that takes in input the list of scripts that compose the development and
 outputs their dependencies in a format suitable for the GNU \texttt{make} tool.
 The user is not asked to run \MATITADEP{} by hand, but
 simply to tell \MATITA{} the root directory of his development (where all
-script files can be found) and \MATITA{} will handle all the compilation
+script files can be found) and \MATITA{} will handle all the generation
 related tasks, including dependencies calculation.
-To compute dependencies it is enough to look at the script files for
-inclusions of other parts of the development or for explicit
-references to other objects (i.e. with explicit aliases, see
-\ref{sec:disambaliases}). 
-
-The output of the compilation is immediately available to the user
-trough the \WHELP{} technology, since all metadata are stored in a
-user-specific area of the database where the search engine has read
-access, and all the automated tactics that operates on the whole
-library, like \AUTO, have full visibility of the newly defined objects.
-
-Compilation is rather simple, and the only tricky case is when we want
-to compile again the same script, maybe after the removal of a
-theorem. Here the policy is simple: clean the output before recompiling.
-As we will see in the next section cleaning will ensure that
-there will be no theorems in the development that depends on the
-removed items.
-
-\subsubsection{Cleaning}
-
-With the term ``cleaning'' we mean the process of removing all the
-results of an object compilation. In order to keep the consistency of
-the library, cleaning an object requires the (recursive) cleaning
-of all the objects that depend on it (\emph{reverse dependencies}).
-
-The calculation of the reverse dependencies can be computed in two
-ways, using the relational database or using a simpler set of metadata
-that \MATITA{} saves in the filesystem as a result of compilation. The
-former technique is the same used by the \emph{Dependency Analyzer}
-described in~\cite{zack-master} and really depends on a relational
-database.
 
-The latter is a fall-back in case the database is not
-available.\footnote{Due to the complex deployment of a large piece of
-software like a database, it is a common practice for the \HELM{} team
-to use a shared remote database, that may be unavailable if the user
-workstation lacks network connectivity.} This facility has to be
-intended only as a fall-back, since the queries of the \WHELP{}
-technology depend require a working database.
+To compute dependencies it is enough to look at the script files for
+disambiguation preferences declared or imported from other scripts
+(see \ref{sec:disambaliases}). 
 
-Cleaning guarantees that if an object is removed there are no dandling
-references to it, and that the part of the library still compiled is
-consistent. Since cleaning involves the removal of all the results of
-the compilation, metadata included, the library browsable trough the
-\WHELP{} technology is always kept up to date.
+Regenerating the content of a modified script file involves the preliminary
+invalidation of all its old content.
 
 \subsubsection{Batch vs Interactive}
 
 \MATITA{} includes an interactive authoring interface and a batch
-``compiler'' (\MATITAC). Only the former is intended to be used directly by the
-user, the latter is automatically invoked when a
-part of the user development is required (for example issuing an
-\texttt{include} command) but not yet compiled.
-
-While they share the same engine for compilation and cleaning, they
-provide different granularity. The batch compiler is only able to
-compile a whole script and similarly to clean only a whole script
-(together with all the other scripts that rely on an object defined in
-it). The interactive interface is able to execute single steps of
-compilation, that may include the definition of an object, and
-similarly to undo single steps. Note that in the latter case there is
-no risk of introducing dangling references since the \MATITA{} user
-interface inhibit undoing a step which is not the last executed.
+``compiler'' (\MATITAC). 
+
+Only the former is intended to be used directly by the
+user, the latter is automatically invoked by \MATITA{}
+to try to regenerate parts of the library previously invalidated.
+
+While they share the same engine for generation and invalidation, they
+provide different granularity. \MATITAC{} is only able to reexecute a
+whole script and similarly to invalidate the whole content of a script
+(together with all the other scripts that rely on an concept defined
+in it). 
 
 \subsection{Automation}
 \label{sec:automation}
@@ -1545,123 +1521,169 @@ using heavy math notation, would definitively be a bad choice.
 \subsection{Tacticals}
 \label{sec:tinycals}
 
-There are mainly two kinds of languages used by proof assistants to recorder
-proofs: tactic based and declarative. We will not investigate the philosophy
-around the choice that many proof assistant made, \MATITA{} included, and we
-will not compare the two different approaches. We will describe the common
-issues of the tactic-based language approach and how \MATITA{} tries to solve
-them.
+%There are mainly two kinds of languages used by proof assistants to recorder
+%proofs: tactic based and declarative. We will not investigate the philosophy
+%around the choice that many proof assistant made, \MATITA{} included, and we
+%will not compare the two different approaches. We will describe the common
+%issues of the tactic-based language approach and how \MATITA{} tries to solve
+%them.
 
-\subsubsection{Tacticals overview}
+The procedural proof language implemented in \MATITA{} is pretty standard,
+with a notable exception for tacticals.
+
+%\subsubsection{Tacticals overview}
+
+Tacticals first appeared in LCF as higher order tactics.  They can be
+seen as control flow constructs, like looping, branching, error
+recovery or sequential composition. 
 
-Tacticals first appeared in LCF and can be seen as programming
-constructs, like looping, branching, error recovery or sequential composition.
-The following simple example shows three tacticals in action
-\begin{grafite}
-theorem trivial: 
-  \forall A,B:Prop. 
-    A = B \to ((A \to B) \land (B \to A)).
-  intros (A B H).
-  split; intro; 
-    [ rewrite < H. assumption.
-    | rewrite > H. assumption.
-    ]
-qed.
-\end{grafite}
 
-The first is ``\texttt{;}'' that combines the tactic \texttt{split}
-with \texttt{intro}, applying the latter to each goal opened by the
-former. Then we have ``\texttt{[}'' that branches on the goals (here
-we have two goals, the two sides of the logic and).
-The first goal $B$ (with $A$ in the context)
-is proved by the first sequence of tactics
-\texttt{rewrite} and \texttt{assumption}. Then we move to the second
-goal with the separator ``\texttt{|}''. The last tactical we see here
-is ``\texttt{.}'' that is a sequential composition that selects the
-first goal opened for the following tactic (instead of applying it to
-them all like ``\texttt{;}''). Note that usually ``\texttt{.}'' is
-not considered a tactical, but a sentence terminator (i.e. the
-delimiter of commands the proof assistant executes).
-
-Giving serious examples here is rather difficult, since they are hard
-to read without the interactive tool. To help the reader in
-understanding the following considerations we just give few common
-usage examples without a proof context.
+The following simple example
+shows a Coq script made of four dot-terminated commands
 
 \begin{grafite}
-  elim z; try assumption; [ ... | ... ].
-  elim z; first [ assumption | reflexivity | id ].
+Theorem trivial: 
+  forall A B:Prop,
+    A = B -> ((A -> B) /\ (B -> A)).
+  intros [A B H].
+  split; intro; 
+    [ rewrite < H; assumption
+    | rewrite > H; assumption
+    ].
+Qed.
 \end{grafite}
 
-The first example goes by induction on a term \texttt{z} and applies
-the tactic \texttt{assumption} to each opened goal eventually recovering if
-\texttt{assumption} fails. Here we are asking the system to close all
-trivial cases and then we branch on the remaining with ``\texttt{[}''.
-The second example goes again by induction on \texttt{z} and tries to
-close each opened goal first with \texttt{assumption}, if it fails it
-tries \texttt{reflexivity} and finally \texttt{id}
-that is the tactic that leaves the goal untouched without failing. 
-
-Note that in the common implementation of tacticals both lines are
-compositions of tacticals and in particular they are a single
-statement (i.e. derived from the same non terminal entry of the
-grammar) ended with ``\texttt{.}''. As we will see later in \MATITA{}
-this is not true, since each atomic tactic or punctuation is considered 
-a single statement.
+The third command is an application of the sequencing tactical
+``$\ldots$\texttt{;}$\ldots$'', that combines the tactic
+\texttt{split} with the application of the branching tactical
+``$\ldots$\texttt{;[}$\ldots$\texttt{|}$\ldots$\texttt{|}$\ldots$\texttt{]}''
+to other tactics and tacticals.
+
+The usual implementation of tacticals executes them atomically as any
+other command. In \MATITA{} thi is not true since each punctuation is
+executed as a single command.
+
+%The latter is applied to all the goals opened by \texttt{split}
+%
+%(here we have two goals, the two sides of the logic and).  The first
+%goal $B$ (with $A$ in the context) is proved by the first sequence of
+%tactics \texttt{rewrite} and \texttt{assumption}. Then we move to the
+%second goal with the separator ``\texttt{|}''. 
+%
+%Giving serious examples here is rather difficult, since they are hard
+%to read without the interactive tool. To help the reader in
+%understanding the following considerations we just give few common
+%usage examples without a proof context.
+%
+%\begin{grafite}
+%  elim z; try assumption; [ ... | ... ].
+%  elim z; first [ assumption | reflexivity | id ].
+%\end{grafite}
+%
+%The first example goes by induction on a term \texttt{z} and applies
+%the tactic \texttt{assumption} to each opened goal eventually recovering if
+%\texttt{assumption} fails. Here we are asking the system to close all
+%trivial cases and then we branch on the remaining with ``\texttt{[}''.
+%The second example goes again by induction on \texttt{z} and tries to
+%close each opened goal first with \texttt{assumption}, if it fails it
+%tries \texttt{reflexivity} and finally \texttt{id}
+%that is the tactic that leaves the goal untouched without failing. 
+%
+%Note that in the common implementation of tacticals both lines are
+%compositions of tacticals and in particular they are a single
+%statement (i.e. derived from the same non terminal entry of the
+%grammar) ended with ``\texttt{.}''. As we will see later in \MATITA{}
+%this is not true, since each atomic tactic or punctuation is considered 
+%a single statement.
 
 \subsubsection{Common issues of tactic(als)-based proof languages}
 We will examine the two main problems of tactic(als)-based proof script:
 maintainability and readability. 
 
-Huge libraries of formal mathematics have been developed, and backward
-compatibility is a really time consuming task. \\
-A real-life example in the history of \MATITA{} was the reordering of
-goals opened by a tactic application. We noticed that some tactics
-were not opening goals in the expected order. In particular the
-\texttt{elim} tactic on a term of an inductive type with constructors
-$c_1, \ldots, c_n$ used to open goals in order $g_1, g_n, g_{n-1}
-\ldots, g_2$. The library of \MATITA{} was still in an embryonic state
-but some theorems about integers were there. The inductive type of
-$\mathcal{Z}$ has three constructors: $zero$, $pos$ and $neg$. All the
-induction proofs on this type where written without tacticals and,
-obviously, considering the three induction cases in the wrong order.
-Fixing the behavior of the tactic broke the library and two days of
-work were needed to make it compile again. The whole time was spent in
-finding the list of tactics used to prove the third induction case and
-swap it with the list of tactics used to prove the second case.  If
-the proofs was structured with the branch tactical this task could
-have been done automatically. 
-
-From this experience we learned that the use of tacticals for
-structuring proofs gives some help but may have some drawbacks in
-proof script readability. We must highlight that proof scripts
-readability is poor by itself, but in conjunction with tacticals it
-can be nearly impossible. The main cause is the fact that in proof
-scripts there is no trace of what you are working on. It is not rare
-for two different theorems to have the same proof script (while the
-proof is completely different).\\
-Bad readability is not a big deal for the user while he is
-constructing the proof, but is considerably a problem when he tries to
-reread what he did or when he shows his work to someone else.  The
-workaround commonly used to read a script is to execute it again
-step-by-step, so that you can see the proof goal changing and you can
-follow the proof steps. This works fine until you reach a tactical.  A
-compound statement, made by some basic tactics glued with tacticals,
-is executed in a single step, while it obviously performs lot of proof
-steps.  In the fist example of the previous section the whole branch
-over the two goals (respectively the left and right part of the logic
-and) result in a single step of execution. The workaround does not work
-anymore unless you de-structure on the fly the proof, putting some
-``\texttt{.}'' where you want the system to stop.\\
-
-Now we can understand the tradeoff between script readability and
-proof structuring with tacticals. Using tacticals helps in maintaining
-scripts, but makes it really hard to read them again, cause of the way
-they are executed.
-
-\MATITA{} uses a language of tactics and tacticals, but tries to avoid
-this tradeoff, alluring the user to write structured proof without
-making it impossible to read them again.
+%Huge libraries of formal mathematics have been developed, and backward
+%compatibility is a really time consuming task. \\
+%A real-life example in the history of \MATITA{} was the reordering of
+%goals opened by a tactic application. We noticed that some tactics
+%were not opening goals in the expected order. In particular the
+%\texttt{elim} tactic on a term of an inductive type with constructors
+%$c_1, \ldots, c_n$ used to open goals in order $g_1, g_n, g_{n-1}
+%\ldots, g_2$. The library of \MATITA{} was still in an embryonic state
+%but some theorems about integers were there. The inductive type of
+%$\mathcal{Z}$ has three constructors: $zero$, $pos$ and $neg$. All the
+%induction proofs on this type where written without tacticals and,
+%obviously, considering the three induction cases in the wrong order.
+%Fixing the behavior of the tactic broke the library and two days of
+%work were needed to make it compile again. The whole time was spent in
+%finding the list of tactics used to prove the third induction case and
+%swap it with the list of tactics used to prove the second case.  If
+%the proofs was structured with the branch tactical this task could
+%have been done automatically. 
+%
+%From this experience we learned that the use of tacticals for
+%structuring proofs gives some help but may have some drawbacks in
+%proof script readability. 
+
+Tacticals are not only used to make scripts shorter by factoring out
+common cases and repeating commands. They are a primary way of making
+scripts more mainteable. Moreover, they also have the well-known
+role of structuring the proof.
+
+However, authoring a proof structured with tacticals is annoying.
+Consider for example a proof by induction, and imagine you
+are using one of the state of the art graphical interfaces for proof assistant
+like Proof General. After applying the induction principle you have to choose:
+immediately structure the proof or postpone the structuring.
+If you decide for the former you have to apply the branching tactical and write
+at once tactics for all the cases. Since the user does not even know the
+generated goals yet, she can only replace all the cases with the identity
+tactic and execute the command, just to receive feedback on the first
+goal. Then she has to go one step back to replace the first identity
+tactic with the wanted one and repeat the process until all the
+branches are closed.
+
+One could imagine that a structured script is simpler to understand.
+This is not the case.
+A proof script, being not declarative, is not meant to be read.
+However, the user has the need of explaining it to others.
+This is achieved by interactively re-playing the script to show each
+intermediate proof status. Tacticals make this operation uncomfortable.
+Indeed, a tactical is executed atomically, while it is obvious that it
+performs lot of smaller steps we are interested in.
+To show the intermediate steps, the proof must be de-structured on the
+fly, for example replacing ``\texttt{;}'' with ``\texttt{.}'' where
+possible.\\
+
+%Proof scripts
+%readability is poor by itself, but in conjunction with tacticals it
+%can be nearly impossible. The main cause is the fact that in proof
+%scripts there is no trace of what you are working on. It is not rare
+%for two different theorems to have the same proof script.\\
+%Bad readability is not a big deal for the user while he is
+%constructing the proof, but is considerably a problem when he tries to
+%reread what he did or when he shows his work to someone else.  The
+%workaround commonly used to read a script is to execute it again
+%step-by-step, so that you can see the proof goal changing and you can
+%follow the proof steps. This works fine until you reach a tactical.  A
+%compound statement, made by some basic tactics glued with tacticals,
+%is executed in a single step, while it obviously performs lot of proof
+%steps.  In the fist example of the previous section the whole branch
+%over the two goals (respectively the left and right part of the logic
+%and) result in a single step of execution. The workaround does not work
+%anymore unless you de-structure on the fly the proof, putting some
+%``\texttt{.}'' where you want the system to stop.\\
+
+%Now we can understand the tradeoff between script readability and
+%proof structuring with tacticals. Using tacticals helps in maintaining
+%scripts, but makes it really hard to read them again, cause of the way
+%they are executed.
+
+\MATITA{} has a peculiar tacticals implementation that provides the
+same benefits as classical tacticals, while not burdening the user
+during proof authoring and re-playing.
+
+%\MATITA{} uses a language of tactics and tacticals, but tries to avoid
+%this tradeoff, alluring the user to write structured proof without
+%making it impossible to read them again.
 
 \subsubsection{The \MATITA{} approach: Tinycals}
 
@@ -1687,51 +1709,41 @@ making it impossible to read them again.
 \MATITA{} tacticals syntax is reported in Tab.~\ref{tab:tacsyn}.
 While one would expect to find structured constructs like 
 $\verb+do+~n~\NT{tactic}$ the syntax allows pieces of tacticals to be written.
-This is essential for base idea behind \MATITA{} tacticals: step-by-step
+This is essential for the base idea behind \MATITA{} tacticals: step-by-step
 execution.
 
 The low-level tacticals implementation of \MATITA{} allows a step-by-step
 execution of a tactical, that substantially means that a $\NT{block\_kind}$ is
-not executed as an atomic operation. This has two major benefits for the user,
-even being a so simple idea:
-\begin{description}
-\item[Proof structuring] 
-  is much easier. Consider for example a proof by induction, and imagine you
-  are using classical tacticals in one of the state of the
-  art graphical interfaces for proof assistant like Proof General or \COQIDE.
-  After applying the induction principle you have to choose: structure
-  the proof or not. If you decide for the former you have to branch with
-  ``\texttt{[}'' and write tactics for all the cases separated by 
-  ``\texttt{|}'' and then close the tactical with ``\texttt{]}''. 
-  You can replace most of the cases by the identity tactic just to
-  concentrate only on the first goal, but you will have to go one step back and
-  one further every time you add something inside the tactical. Again this is
-  caused by the one step execution of tacticals and by the fact that to modify
-  the already executed script you have to undo one step.
-  And if you are board of doing so, you will finish in giving up structuring
-  the proof and write a plain list of tactics.\\
-  With step-by-step tacticals you can apply the induction principle, and just
-  open the branching tactical ``\texttt{[}''. Then you can interact with the
-  system reaching a proof of the first case, without having to specify any
-  tactic for the other goals. When you have proved all the induction cases, you
-  close the branching tactical with ``\texttt{]}'' and you are done with a 
-  structured proof. \\
-  While \MATITA{} tacticals help in structuring proofs they allow you to 
-  choose the amount of structure you want. There are no constraints imposed by
-  the system, and if the user wants he can even write completely plain proofs.
+not executed as an atomic operation. This has major benefits for the
+user during proof structuring and re-playing.
+
+For instance, reconsider the previous example of a proof by induction.
+With step-by-step tacticals the user can apply the induction principle, and just
+open the branching tactical ``\texttt{[}''. Then she can interact with the
+system until the proof of the first case is terminated. After that
+``\texttt{|}'' is used to move to the next goal, until all goals are
+closed. After the last goal, the user closes the branching tactical with
+``\texttt{]}'' and is done with a structured proof. \\
+While \MATITA{} tacticals help in structuring proofs they allow you to 
+choose the amount of structure you want. There are no constraints imposed by
+the system, and if the user wants he can even write completely plain proofs.
   
-\item[Rereading]
-  is possible. Going on step by step shows exactly what is going on.  Consider
-  again a proof by induction, that starts applying the induction principle and
-  suddenly branches with a ``\texttt{[}''. This clearly separates all the
-  induction cases, but if the square brackets content is executed in one single
-  step you completely loose the possibility of rereading it and you have to
-  temporary remove the branching tactical to execute in a satisfying way the
-  branches.  Again, executing step-by-step is the way you would like to review
-  the demonstration. Remember that understanding the proof from the script is
-  not easy, and only the execution of tactics (and the resulting transformed
-  goal) gives you the feeling of what is going on.
-\end{description}
+Re-playing a proof is also made simpler. There is no longer any need
+to destructure the proof on the fly since \MATITA{} executes each
+tactical not atomically.
+
+%\item[Rereading]
+%  is possible. Going on step by step shows exactly what is going on.  Consider
+%  again a proof by induction, that starts applying the induction principle and
+%  suddenly branches with a ``\texttt{[}''. This clearly separates all the
+%  induction cases, but if the square brackets content is executed in one single
+%  step you completely loose the possibility of rereading it and you have to
+%  temporary remove the branching tactical to execute in a satisfying way the
+%  branches.  Again, executing step-by-step is the way you would like to review
+%  the demonstration. Remember that understanding the proof from the script is
+%  not easy, and only the execution of tactics (and the resulting transformed
+%  goal) gives you the feeling of what is going on.
+%\end{description}
 
 \section{Standard library}
 \label{sec:stdlib}
-- 
2.39.5