From 5f4abc5ba58ff12dfabd1acf4d784ff7f242cbf9 Mon Sep 17 00:00:00 2001 From: Enrico Tassi Date: Fri, 25 Nov 2005 10:42:52 +0000 Subject: [PATCH 1/1] added tinycals and patterns subsections --- helm/papers/matita/matita2.tex | 415 +++++++++++++++++++++++++++++++++ 1 file changed, 415 insertions(+) diff --git a/helm/papers/matita/matita2.tex b/helm/papers/matita/matita2.tex index be3cc61bf..80d2a04d5 100644 --- a/helm/papers/matita/matita2.tex +++ b/helm/papers/matita/matita2.tex @@ -356,6 +356,421 @@ data structures (\texttt{extlib}) and central storage for configuration options \texttt{cic} +\section{Partially specified terms} +--- il mondo delle tattiche e dintorni --- +serve una intro che almeno cita il widget (per i patterns) e che fa +il resoconto delle cose che abbiamo e che non descriviamo, +sottolineando che abbiamo qualcosa da dire sui pattern e sui +tattichini.\\ + + + +\subsection{Patterns} +Patterns are the textual counterpart of the MathML widget graphical +selection. + +Matita benefits of a graphical interface and a powerful MathML rendering +widget that allows the user to select pieces of the sequent he is working +on. While this is an extremely intuitive way for the user to +restrict the application of tactics, for example, to some subterms of the +conclusion or some hypothesis, the way this action is recorded to the text +script is not obvious.\\ +In \MATITA{} this issue is addressed by patterns. + +\subsubsection{Pattern syntax} +A pattern is composed of two terms: a $\NT{sequent\_path}$ and a +$\NT{wanted}$. +The former mocks-up a sequent, discharging unwanted subterms with $?$ and +selecting the interesting parts with the placeholder $\%$. +The latter is a term that lives in the context of the placeholders. + +The concrete syntax is reported in table \ref{tab:pathsyn} +\NOTE{uso nomi diversi dalla grammatica ma che hanno + senso} +\begin{table} + \caption{\label{tab:pathsyn} Concrete syntax of \MATITA{} patterns.\strut} +\hrule +\[ +\begin{array}{@{}rcll@{}} + \NT{pattern} & + ::= & [~\verb+in match+~\NT{wanted}~]~[~\verb+in+~\NT{sequent\_path}~] & \\ + \NT{sequent\_path} & + ::= & \{~\NT{ident}~[~\verb+:+~\NT{multipath}~]~\}~ + [~\verb+\vdash+~\NT{multipath}~] & \\ + \NT{wanted} & ::= & \NT{term} & \\ + \NT{multipath} & ::= & \NT{term\_with\_placeholders} & \\ +\end{array} +\] +\hrule +\end{table} + +\subsubsection{How patterns work} +Patterns mimic the user's selection in two steps. The first one +selects roots (subterms) of the sequent, using the +$\NT{sequent\_path}$, while the second +one searches the $\NT{wanted}$ term starting from these roots. Both are +optional steps, and by convention the empty pattern selects the whole +conclusion. + +\begin{description} +\item[Phase 1] + concerns only the $[~\verb+in+~\NT{sequent\_path}~]$ + part of the syntax. $\NT{ident}$ is an hypothesis name and + selects the assumption where the following optional $\NT{multipath}$ + will operate. \verb+\vdash+ can be considered the name for the goal. + If the whole pattern is omitted, the whole goal will be selected. + If one or more hypotheses names are given the selection is restricted to + these assumptions. If a $\NT{multipath}$ is omitted the whole + assumption is selected. Remember that the user can be mostly + unaware of this syntax, since the system is able to write down a + $\NT{sequent\_path}$ starting from a visual selection. + \NOTE{Questo ancora non va in matita} + + A $\NT{multipath}$ is a CiC term in which a special constant $\%$ + is allowed. + The roots of discharged subterms are marked with $?$, while $\%$ + is used to select roots. The default $\NT{multipath}$, the one that + selects the whole term, is simply $\%$. + Valid $\NT{multipath}$ are, for example, $(?~\%~?)$ or $\%~\verb+\to+~(\%~?)$ + that respectively select the first argument of an application or + the source of an arrow and the head of the application that is + found in the arrow target. + + The first phase selects not only terms (roots of subterms) but also + their context that will be eventually used in the second phase. + +\item[Phase 2] + plays a role only if the $[~\verb+in match+~\NT{wanted}~]$ + part is specified. From the first phase we have some terms, that we + will see as subterm roots, and their context. For each of these + contexts the $\NT{wanted}$ term is disambiguated in it and the + corresponding root is searched for a subterm $\alpha$-equivalent to + $\NT{wanted}$. The result of this search is the selection the + pattern represents. + +\end{description} + +\noindent +Since the first step is equipotent to the composition of the two +steps, the system uses it to represent each visual selection. +The second step is only meant for the +experienced user that writes patterns by hand, since it really +helps in writing concise patterns as we will see in the +following examples. + +\subsubsection{Examples} +To explain how the first step works let's give an example. Consider +you want to prove the uniqueness of the identity element $0$ for natural +sum, and that you can relay on the previously demonstrated left +injectivity of the sum, that is $inj\_plus\_l:\forall x,y,z.x+y=z+y \to x =z$. +Typing +\begin{grafite} +theorem valid_name: \forall n,m. m + n = n \to m = O. + intros (n m H). +\end{grafite} +\noindent +leads you to the following sequent +\sequent{ +n:nat\\ +m:nat\\ +H: m + n = n}{ +m=O +} +\noindent +where you want to change the right part of the equivalence of the $H$ +hypothesis with $O + n$ and then use $inj\_plus\_l$ to prove $m=O$. +\begin{grafite} + change in H:(? ? ? %) with (O + n). +\end{grafite} +\noindent +This pattern, that is a simple instance of the $\NT{sequent\_path}$ +grammar entry, acts on $H$ that has type (without notation) $(eq~nat~(m+n)~n)$ +and discharges the head of the application and the first two arguments with a +$?$ and selects the last argument with $\%$. The syntax may seem uncomfortable, +but the user can simply select with the mouse the right part of the equivalence +and left to the system the burden of writing down in the script file the +corresponding pattern with $?$ and $\%$ in the right place (that is not +trivial, expecially where implicit arguments are hidden by the notation, like +the type $nat$ in this example). + +Changing all the occurrences of $n$ in the hypothesis $H$ with $O+n$ +works too and can be done, by the experienced user, writing directly +a simpler pattern that uses the second phase. +\begin{grafite} + change in match n in H with (O + n). +\end{grafite} +\noindent +In this case the $\NT{sequent\_path}$ selects the whole $H$, while +the second phase searches the wanted $n$ inside it by +$\alpha$-equivalence. The resulting +equivalence will be $m+(O+n)=O+n$ since the second phase found two +occurrences of $n$ in $H$ and the tactic changed both. + +Just for completeness the second pattern is equivalent to the +following one, that is less readable but uses only the first phase. +\begin{grafite} + change in H:(? ? (? ? %) %) with (O + n). +\end{grafite} +\noindent + +\subsubsection{Tactics supporting patterns} +In \MATITA{} all the tactics that can be restricted to subterm of the working +sequent accept the pattern syntax. In particular these tactics are: simplify, +change, fold, unfold, generalize, replace and rewrite. + +\NOTE{attualmente rewrite e fold non supportano phase 2. per +supportarlo bisogna far loro trasformare il pattern phase1+phase2 +in un pattern phase1only come faccio nell'ultimo esempio. lo si fa +con una pattern\_of(select(pattern))} + +\subsubsection{Comparison with Coq} +Coq has a two diffrent ways of restricting the application of tactis to +subterms of the sequent, both relaying on the same special syntax to identify +a term occurrence. + +The first way is to use this special syntax to specify directly to the +tactic the occurrnces of a wanted term that should be affected, while +the second is to prepare the sequent with another tactic called +pattern and the apply the real tactic. Note that the choice is not +left to the user, since some tactics needs the sequent to be prepared +with pattern and do not accept directly this special syntax. + +The base idea is that to identify a subterm of the sequent we can +write it and say that we want, for example, the third and the fifth +occurce of it (counting from left to right). In our previous example, +to change only the left part of the equivalence, the correct command +is +\begin{grafite} + change n at 2 in H with (O + n) +\end{grafite} +\noindent +meaning that in the hypothesis $H$ the $n$ we want to change is the +second we encounter proceeding from left toright. + +The tactic pattern computes a +$\beta$-expansion of a part of the sequent with respect to some +occurrences of the given term. In the previous example the following +command +\begin{grafite} + pattern n at 2 in H +\end{grafite} +\noindent +would have resulted in this sequent +\begin{grafite} + n : nat + m : nat + H : (fun n0 : nat => m + n = n0) n + ============================ + m = 0 +\end{grafite} +\noindent +where $H$ is $\beta$-expanded over the second $n$ +occurrence. This is a trick to make the unification algorithm ignore +the head of the application (since the unification is essentially +first-order) but normally operate on the arguments. +This works for some tactics, like rewrite and replace, +but for example not for change and other tactics that do not relay on +unification. + +The idea behind this way of identifying subterms in not really far +from the idea behind patterns, but really fails in extending to +complex notation, since it relays on a mono-dimensional sequent representation. +Real math notation places arguments upside-down (like in indexed sums or +integrations) or even puts them inside a bidimensional matrix. +In these cases using the mouse to select the wanted term is probably the +only way to tell the system exactly what you want to do. + +One of the goals of \MATITA{} is to use modern publishing techiques, and +adopting a method for restricting tactics application domain that discourages +using heavy math notation, would definitively be a bad choice. + +\subsection{Tacticals} +There are mainly two kinds of languages used by proof assistants to recorder +proofs: tactic based and declarative. We will not investigate the philosophy +aroud the choice that many proof assistant made, \MATITA{} included, and we +will not compare the two diffrent approaches. We will describe the common +issues of the tactic-based language approach and how \MATITA{} tries to solve +them. + +\subsubsection{Tacticals overview} + +Tacticals first appeared in LCF and can be seen as programming +constructs, like looping, branching, error recovery or sequential composition. +The following simple example shows three tacticals in action +\begin{grafite} +theorem trivial: + \forall A,B:Prop. + A = B \to ((A \to B) \land (B \to A)). + intros (A B H). + split; intro; + [ rewrite < H. assumption. + | rewrite > H. assumption. + ] +qed. +\end{grafite} + +The first is ``\texttt{;}'' that combines the tactic \texttt{split} +with \texttt{intro}, applying the latter to each goal opened by the +former. Then we have ``\texttt{[}'' that branches on the goals (here +we have two goals, the two sides of the logic and). +The first goal $B$ (with $A$ in the context) +is proved by the first sequence of tactics +\texttt{rewrite} and \texttt{assumption}. Then we move to the second +goal with the separator ``\texttt{|}''. The last tactical we see here +is ``\texttt{.}'' that is a sequential composition that selects the +first goal opened for the following tactic (instead of applying it to +them all like ``\texttt{;}''). Note that usually ``\texttt{.}'' is +not considered a tactical, but a sentence terminator (i.e. the +delimiter of commands the proof assistant executes). + +Giving serious examples here is rather difficult, since they are hard +to read without the interactive tool. To help the reader in +understanding the following considerations we just give few common +usage examples without a proof context. + +\begin{grafite} + elim z; try assumption; [ ... | ... ]. + elim z; first [ assumption | reflexivity | id ]. +\end{grafite} + +The first example goes by induction on a term \texttt{z} and applies +the tactic \texttt{assumption} to each opened goal eventually recovering if +\texttt{assumption} fails. Here we are asking the system to close all +trivial cases and then we branch on the remaining with ``\texttt{[}''. +The second example goes again by induction on \texttt{z} and tries to +close each opened goal first with \texttt{assumption}, if it fails it +tries \texttt{reflexivity} and finally \texttt{id} +that is the tactic that leaves the goal untouched without failing. + +Note that in the common implementation of tacticals both lines are +compositions of tacticals and in particular they are a single +statement (i.e. derived from the same non terminal entry of the +grammar) ended with ``\texttt{.}''. As we will see later in \MATITA{} +this is not true, since each atomic tactic or punctuation is considered +a single statement. + +\subsubsection{Common issues of tactic(als)-based proof languages} +We will examine the two main problems of tactic(als)-based proof script: +maintainability and readability. + +Huge libraries of formal mathematics have been developed, and backward +compatibility is a really time consuming task. \\ +A real-life example in the history of \MATITA{} was the reordering of +goals opened by a tactic application. We noticed that some tactics +were not opening goals in the expected order. In particular the +\texttt{elim} tactic on a term of an inductive type with constructors +$c_1, \ldots, c_n$ used to open goals in order $g_1, g_n, g_{n-1} +\ldots, g_2$. The library of \MATITA{} was still in an embryonic state +but some theorems about integers were there. The inductive type of +$\mathcal{Z}$ has three constructors: $zero$, $pos$ and $neg$. All the +induction proofs on this type where written without tacticals and, +obviously, considering the three induction cases in the wrong order. +Fixing the behavior of the tactic broke the library and two days of +work were needed to make it compile again. The whole time was spent in +finding the list of tactics used to prove the third induction case and +swap it with the list of tactics used to prove the second case. If +the proofs was structured with the branch tactical this task could +have been done automatically. + +From this experience we learned that the use of tacticals for +structuring proofs gives some help but may have some drawbacks in +proof script readability. We must highlight that proof scripts +readability is poor by itself, but in conjunction with tacticals it +can be nearly impossible. The main cause is the fact that in proof +scripts there is no trace of what you are working on. It is not rare +for two different theorems to have the same proof script (while the +proof is completely different).\\ +Bad readability is not a big deal for the user while he is +constructing the proof, but is considerably a problem when he tries to +reread what he did or when he shows his work to someone else. The +workaround commonly used to read a script is to execute it again +step-by-step, so that you can see the proof goal changing and you can +follow the proof steps. This works fine until you reach a tactical. A +compound statement, made by some basic tactics glued with tacticals, +is executed in a single step, while it obviously performs lot of proof +steps. In the fist example of the previous section the whole branch +over the two goals (respectively the left and right part of the logic +and) result in a single step of execution. The workaround doesn't work +anymore unless you de-structure on the fly the proof, putting some +``\texttt{.}'' where you want the system to stop.\\ + +Now we can understand the tradeoff between script readability and +proof structuring with tacticals. Using tacticals helps in maintaining +scripts, but makes it really hard to read them again, cause of the way +they are executed. + +\MATITA{} uses a language of tactics and tacticals, but tries to avoid +this tradeoff, alluring the user to write structured proof without +making it impossible to read them again. + +\subsubsection{The \MATITA{} approach: Tinycals} + +\begin{table} + \caption{\label{tab:tacsyn} Concrete syntax of \MATITA{} tacticals.\strut} +\hrule +\[ +\begin{array}{@{}rcll@{}} + \NT{punctuation} & + ::= & \SEMICOLON \quad|\quad \DOT \quad|\quad \SHIFT \quad|\quad \BRANCH \quad|\quad \MERGE \quad|\quad \POS{\mathrm{NUMBER}~} & \\ + \NT{block\_kind} & + ::= & \verb+focus+ ~|~ \verb+try+ ~|~ \verb+solve+ ~|~ \verb+first+ ~|~ \verb+repeat+ ~|~ \verb+do+~\mathrm{NUMBER} & \\ + \NT{block\_delimiter} & + ::= & \verb+begin+ ~|~ \verb+end+ & \\ + \NT{tactical} & + ::= & \verb+skip+ ~|~ \NT{tactic} ~|~ \NT{block\_delimiter} ~|~ \NT{block\_kind} ~|~ \NT{punctuation} ~|~& \\ +\end{array} +\] +\hrule +\end{table} + +\MATITA{} tacticals syntax is reported in table \ref{tab:tacsyn}. +While one would expect to find structured constructs like +$\verb+do+~n~\NT{tactic}$ the syntax allows pieces of tacticals to be written. +This is essential for base idea behind matita tacticals: step-by-step execution. + +The low-level tacticals implementation of \MATITA{} allows a step-by-step +execution of a tactical, that substantially means that a $\NT{block\_kind}$ is +not executed as an atomic operation. This has two major benefits for the user, +even being a so simple idea: +\begin{description} +\item[Proof structuring] + is much easier. Consider for example a proof by induction, and imagine you + are using classical tacticals in one of the state of the + art graphical interfaces for proof assistant like Proof General or Coq Ide. + After applying the induction principle you have to choose: structure + the proof or not. If you decide for the former you have to branch with + ``\texttt{[}'' and write tactics for all the cases separated by + ``\texttt{|}'' and then close the tactical with ``\texttt{]}''. + You can replace most of the cases by the identity tactic just to + concentrate only on the first goal, but you will have to go one step back and + one further every time you add something inside the tactical. Again this is + caused by the one step execution of tacticals and by the fact that to modify + the already executed script you have to undo one step. + And if you are board of doing so, you will finish in giving up structuring + the proof and write a plain list of tactics.\\ + With step-by-step tacticals you can apply the induction principle, and just + open the branching tactical ``\texttt{[}''. Then you can interact with the + system reaching a proof of the first case, without having to specify any + tactic for the other goals. When you have proved all the induction cases, you + close the branching tactical with ``\texttt{]}'' and you are done with a + structured proof. \\ + While \MATITA{} tacticals help in structuring proofs they allow you to + choose the amount of structure you want. There are no constraints imposed by + the system, and if the user wants he can even write completely plain proofs. + +\item[Rereading] + is possible. Going on step by step shows exactly what is going on. Consider + again a proof by induction, that starts applying the induction principle and + suddenly branches with a ``\texttt{[}''. This clearly separates all the + induction cases, but if the square brackets content is executed in one single + step you completely loose the possibility of rereading it and you have to + temporary remove the branching tactical to execute in a satisfying way the + branches. Again, executing step-by-step is the way you would like to review + the demonstration. Remember that understanding the proof from the script is + not easy, and only the execution of tactics (and the resulting transformed + goal) gives you the feeling of what is going on. +\end{description} + + \acknowledgements We would like to thank all the students that during the past five years collaborated in the \HELM{} project and contributed to -- 2.39.2