helm/ocaml/cic_notation/doc/body.tex

   1
   2 \section{Introduction}
   3
   4 Mathematical notation plays a fundamental role in mathematical practice: it
   5 helps expressing in a concise symbolic fashion concepts of arbitrary complexity.
   6 Its use in proof assistants like \MATITA{} is no exception. Formal mathematics
   7 indeed often impose to encode mathematical concepts at a very high level of
   8 details (e.g. Peano numbers, implicit arguments) having a restricted toolbox of
   9 syntactic constructions in the calculus.
  10
  11 Consider for example one of the point reached while proving the distributivity
  12 of times over minus on natural numbers included in the \MATITA{} standards
  13 library. (Part of) the reached sequent can be seen in \MATITA{} both using the
  14 notation for various arithmetical and relational operator or without using it.
  15 The sequent rendered without using notation looks as follows:
  16
  17 \sequent{
  18 \mathtt{H}: \mathtt{le} z y\\
  19 \mathtt{Hcut}: \mathtt{eq} \mathtt{nat} (\mathtt{plus} (\mathtt{times} x (\mathtt{minus}
  20 y z)) (\mathtt{times} x z))\\
  21 (\mathtt{plus} (\mathtt{minus} (\mathtt{times} x y) (\mathtt{times} x z))
  22 (\mathtt{times} x z))}{
  23 \mathtt{eq} \mathtt{nat} (\mathtt{times} x (\mathtt{minus} y z)) (\mathtt{minus}
  24 (\mathtt{times} x y) (\mathtt{times} x z))}
  25
  26 while the corresponding sequent rendered with notation enabled looks:
  27
  28 \sequent{
  29 H: z\leq y\\
  30 Hcut: x*(y-z)+x*z=x*y-x*z+x*z}{
  31 x*(y-z)=x*y-x*z}
  32
  33 The latter representation is evidently more readable than the former helping
  34 users both in concentrating on the key aspects of the proof (namely on choosing
  35 the right strategy to proceed in the proof) and in reducing the amount of input
  36 that need to be provided to the system when term input is required (assuming the
  37 exists a correspondence among the rendered output and the textual input syntax
  38 used by the user, as it happens in \MATITA).
  39
  40 In this section we present the \emph{extensible notation} mechanism implemented
  41 in \MATITA. Its role may be looked at from two different point of view: the term
  42 input phase and the term output --- or rendering --- phase. We arbitrarly
  43 decided to call the former view ``from the left'' and the latter ``from the
  44 right''. Looking from the point of view of the input phase it offers a mechanism
  45 of dynamic extension of the term grammar enabling the user to define fancy
  46 mathematical notations.  Looking from the point of view of rendering it enable
  47 the reconstruction of such notations from CIC term and its rendering to various
  48 presentation languages (at the time of writing supported languages are MathML
  49 Presentation and the \MATITA{} concrete syntax for terms).
  50
  51 If you're wondering why the notation mechanisms need to be ``extensible'', the
  52 answer lays in how notation is used in the development of formal mathematics
  53 with proof assistants. When doing ordinary (i.e. non automatically checkable by
  54 the mean of a proof checker) mathematics, notation is often confused with the
  55 mathematical objects being defined. ``+'' may be thought as \emph{the} addition
  56 (and is often termed as such in mathematical textbooks!), but is rather the
  57 notation for one particolar kind of addition which may possibly be used in an
  58 overloaded fashion elsewhere. When doing formal mathematics the difference is
  59 tangible and users has to deal separately with all the actions we skimmed
  60 through:
  61
  62 \begin{enumerate}
  63
  64  \item definition of mathematical objects (e.g. addition over Peano numbers
  65   using the primitive recursion scheme);
  66
  67  \item definition of new mathematical notation (e.g. infix use of the $+$ symbol
  68   as in $x + 3$);
  69
  70  \item (incremental) definition of the meanings of a given notation (e.g. the
  71   use of the notation of (2) above for denoting the addition of (1)).
  72
  73 \end{enumerate}
  74
  75 Since all the points above are part of everyday life of proof assistants users
  76 we know that mathematical notation in the system will change and we can't
  77 provide a ``one-size fits all'' solution as is done for instance in mainstream
  78 programming languages mathematical notation. For this reason \MATITA{} supports
  79 all the above actions in a coherent manner in both term input and output.
  80
  81 \section{Looking from the left: term input}
  82
  83 \subsubsection{\MATITA{} input phase}
  84
  85  \begin{table}
  86   \caption{\label{tab:termsyn} Concrete syntax of CIC terms: built-in
  87   notation\strut}
  88  \hrule
  89  \[
  90  \begin{array}{@{}rcll@{}}
  91    \NT{term} & ::= & & \mbox{\bf terms} \\
  92      &     & x & \mbox{(identifier)} \\
  93      &  |  & n & \mbox{(number)} \\
  94      &  |  & s & \mbox{(symbol)} \\
  95      &  |  & \mathrm{URI} & \mbox{(URI)} \\
  96      &  |  & \verb+_+ & \mbox{(implicit)} \\
  97      &  |  & \verb+?+n~[\verb+[+~\{\NT{subst}\}~\verb+]+] & \mbox{(meta)} \\
  98      &  |  & \verb+let+~\NT{ptname}~\verb+\def+~\NT{term}~\verb+in+~\NT{term} \\
  99      &  |  & \verb+let+~\NT{kind}~\NT{defs}~\verb+in+~\NT{term} \\
 100      &  |  & \NT{binder}~\{\NT{ptnames}\}^{+}~\verb+.+~\NT{term} \\
 101      &  |  & \NT{term}~\NT{term} & \mbox{(application)} \\
 102      &  |  & \verb+Prop+ \mid \verb+Set+ \mid \verb+Type+ \mid \verb+CProp+ & \mbox{(sort)} \\
 103      &  |  & \verb+match+~\NT{term}~ & \mbox{(pattern matching)} \\
 104      &     & ~ ~ [\verb+[+~\verb+in+~x~\verb+]+]
 105               ~ [\verb+[+~\verb+return+~\NT{term}~\verb+]+] \\
 106      &     & ~ ~ \verb+with [+~[\NT{rule}~\{\verb+|+~\NT{rule}\}]~\verb+]+ & \\
 107      &  |  & \verb+(+~\NT{term}~\verb+:+~\NT{term}~\verb+)+ & \mbox{(cast)} \\
 108      &  |  & \verb+(+~\NT{term}~\verb+)+ \\
 109    \NT{defs}  & ::= & & \mbox{\bf mutual definitions} \\
 110      &     & \NT{fun}~\{\verb+and+~\NT{fun}\} \\
 111    \NT{fun} & ::= & & \mbox{\bf functions} \\
 112      &     & \NT{arg}~\{\NT{ptnames}\}^{+}~[\verb+on+~x]~\verb+\def+~\NT{term} \\
 113    \NT{binder} & ::= & & \mbox{\bf binders} \\
 114      &     & \verb+\forall+ \mid \verb+\lambda+ \\
 115    \NT{arg} & ::= & & \mbox{\bf single argument} \\
 116      &     & \verb+_+ \mid x \\
 117    \NT{ptname} & ::= & & \mbox{\bf possibly typed name} \\
 118      &     & \NT{arg} \\
 119      &  |  & \verb+(+~\NT{arg}~\verb+:+~\NT{term}~\verb+)+ \\
 120    \NT{ptnames} & ::= & & \mbox{\bf bound variables} \\
 121      &     & \NT{arg} \\
 122      &  |  & \verb+(+~\NT{arg}~\{\verb+,+~\NT{arg}\}~[\verb+:+~\NT{term}]~\verb+)+ \\
 123    \NT{kind} & ::= & & \mbox{\bf induction kind} \\
 124      &     & \verb+rec+ \mid \verb+corec+ \\
 125    \NT{rule} & ::= & & \mbox{\bf rules} \\
 126      &     & x~\{\NT{ptname}\}~\verb+\Rightarrow+~\NT{term}
 127  \end{array}
 128  \]
 129  \hrule
 130  \end{table}
 131
 132 The primary form of user interaction employed by \MATITA{} is textual script
 133 editing: the user modifies it and evaluate step by step its composing
 134 \emph{statements}. Examples of statements are inductive type definitions,
 135 theorem declarations, LCF-style tacticals, and macros (e.g. \texttt{Check} can
 136 be used to ask the system to refine a given term and pretty print the result).
 137 Since many statements refer to terms of the underlying calculus, \MATITA{} needs
 138 a concrete syntax able to encode terms of the Calculus of Inductive
 139 Constructions.
 140
 141 Two of the requirements in the design of such a syntax are apparently in
 142 contrast:
 143
 144 \begin{enumerate}
 145
 146  \item the syntax should be as close as possible to common mathematical practice
 147   and implement widespread mathematical notations;
 148
 149  \item each term described by the syntax should be non-ambiguous meaning that it
 150   should exists a function which associates to it a CIC term.
 151
 152 \end{enumerate}
 153
 154 These two requirements are addressed in \MATITA{} by the mean of two mechanisms
 155 which work together: \emph{term disambiguation} and \emph{extensible notation}.
 156 Their interaction is visible in the architecture of the \MATITA{} input phase,
 157 depicted in Fig.~\ref{fig:inputphase}. The architecture is articulated as a
 158 pipeline of three levels: the concrete syntax level (level 0) is the one the
 159 user has to deal with when inserting CIC terms; the abstract syntax level (level
 160 2) is an internal representation which intuitively encodes mathematical formulae
 161 at the content level~\cite{adams}\cite{mkm-structure}; the last level is that of
 162 CIC terms.
 163
 164 \begin{figure}[ht]
 165  \begin{center}
 166   \includegraphics[width=0.9\textwidth]{input_phase}
 167   \caption{\MATITA{} input phase}
 168  \end{center}
 169  \label{fig:inputphase}
 170 \end{figure}
 171
 172 Requirement (1) is addressed by a built-in concrete syntax for terms, described
 173 in Tab.~\ref{tab:termsyn}, and the extensible notation mechanisms which offers a
 174 way for extending available mathematical notations and providing a parser for
 175 the extended notation. Requirement (2) is addressed by the conjunct action of
 176 that parsing function and disambiguation which provides a function from content
 177 level terms to CIC terms.
 178
 179 \subsubsection{From concrete syntax to content level}
 180
 181 Content level terms are instances of what are commonly referred as Abstract
 182 Syntax Trees (ASTs) in compilers literature. In this respect the mapping from
 183 concrete syntax fo content level is nothing more than the pipelined application
 184 of a lexer and a parser to the characters that form terms at the concrete syntax
 185 level.
 186
 187 The plus offered by the notation mechanisms is the ability to dinamically extend
 188 the parsing rules which build abstract syntax tree from stream of lexer tokens.
 189 For example, in the standard library of \MATITA{} we found the following
 190 statements which define the notation used for the ``+'' infix operator.
 191
 192 \begin{example}
 193 \begin{Verbatim}
 194  notation "a + b"
 195    left associative with precedence 50
 196  for @{ 'plus $a $b }.
 197 \end{Verbatim}
 198 \end{example}
 199
 200 The meaning of such a statement is to declare a bidirectional
 201 mapping\footnote{in this section we only deal with the left to right part of the
 202 mapping, but it is actually bidirectional} between a concrete syntax pattern
 203 (the part of the statement inside double quotes) and a content level pattern
 204 (the part of the statement which follows \texttt{for}). The syntax of concrete
 205 syntax patterns and content level patterns can be found in Tab.~\ref{tab:l1c}
 206 and Tab.~\ref{tab:l2c} respectively.
 207
 208 \begin{table}
 209 \caption{\label{tab:l1c} Concrete syntax of level 1 patterns.\strut}
 210 \hrule
 211 \[
 212 \begin{array}{rcll}
 213   P & ::= & & \mbox{(\bf patterns)} \\
 214     &     & S^{+} \\[2ex]
 215   S & ::= & & \mbox{(\bf simple patterns)} \\
 216     &     & l \\
 217     &  |  & S~\verb+\sub+~S\\
 218     &  |  & S~\verb+\sup+~S\\
 219     &  |  & S~\verb+\below+~S\\
 220     &  |  & S~\verb+\atop+~S\\
 221     &  |  & S~\verb+\over+~S\\
 222     &  |  & S~\verb+\atop+~S\\
 223     &  |  & \verb+\frac+~S~S \\
 224     &  |  & \verb+\sqrt+~S \\
 225     &  |  & \verb+\root+~S~\verb+\of+~S \\
 226     &  |  & \verb+(+~P~\verb+)+ \\
 227     &  |  & \verb+hbox (+~P~\verb+)+ \\
 228     &  |  & \verb+vbox (+~P~\verb+)+ \\
 229     &  |  & \verb+hvbox (+~P~\verb+)+ \\
 230     &  |  & \verb+hovbox (+~P~\verb+)+ \\
 231     &  |  & \verb+break+ \\
 232     &  |  & \verb+list0+~S~[\verb+sep+~l] \\
 233     &  |  & \verb+list1+~S~[\verb+sep+~l] \\
 234     &  |  & \verb+opt+~S \\
 235     &  |  & [\verb+term+]~x \\
 236     &  |  & \verb+number+~x \\
 237     &  |  & \verb+ident+~x \\
 238 \end{array}
 239 \]
 240 \hrule
 241 \end{table}
 242
 243 \begin{table}
 244 \caption{\label{tab:l1a} Abstract syntax of level 1 terms and patterns.\strut}
 245 \hrule
 246 \[
 247 \begin{array}{@{}ll@{}}
 248 \begin{array}[t]{rcll}
 249   T & ::= & & \mbox{(\bf terms)} \\
 250     &     & L_\kappa[T_1,\dots,T_n] & \mbox{(layout)} \\
 251     &  |  & B_\kappa^{ab}[T_1\cdots T_n] & \mbox{(box)} \\
 252     &  |  & \BREAK & \mbox{(breakpoint)} \\
 253     &  |  & \FENCED{T_1\cdots T_n} & \mbox{(fenced)} \\
 254     &  |  & l & \mbox{(literal)} \\[2ex]
 255   P & ::= & & \mbox{(\bf patterns)} \\
 256     &     & L_\kappa[P_1,\dots,P_n] & \mbox{(layout)} \\
 257     &  |  & B_\kappa^{ab}[P_1\cdots P_n] & \mbox{(box)} \\
 258     &  |  & \BREAK & \mbox{(breakpoint)} \\
 259     &  |  & \FENCED{P_1\cdots P_n} & \mbox{(fenced)} \\
 260     &  |  & M & \mbox{(magic)} \\
 261     &  |  & V & \mbox{(variable)} \\
 262     &  |  & l & \mbox{(literal)} \\
 263 \end{array} &
 264 \begin{array}[t]{rcll}
 265   V & ::= & & \mbox{(\bf variables)} \\
 266     &     & \TVAR{x} & \mbox{(term variable)} \\
 267     &  |  & \NVAR{x} & \mbox{(number variable)} \\
 268     &  |  & \IVAR{x} & \mbox{(name variable)} \\[2ex]
 269   M & ::= & & \mbox{(\bf magic patterns)} \\
 270     &     & \verb+list0+~P~l? & \mbox{(possibly empty list)} \\
 271     &  |  & \verb+list1+~P~l? & \mbox{(non-empty list)} \\
 272     &  |  & \verb+opt+~P & \mbox{(option)} \\[2ex]
 273 \end{array}
 274 \end{array}
 275 \]
 276 \hrule
 277 \end{table}
 278
 279 \begin{table}
 280 \caption{\label{tab:synl2} Concrete syntax of level 2 patterns.\strut}
 281 \hrule
 282 \[
 283 \begin{array}{@{}rcll@{}}
 284   \NT{term} & ::= & & \mbox{\bf terms} \\
 285     &     & x & \mbox{(identifier)} \\
 286     &  |  & n & \mbox{(number)} \\
 287     &  |  & s & \mbox{(symbol)} \\
 288     &  |  & \mathrm{URI} & \mbox{(URI)} \\
 289     &  |  & \verb+?+ & \mbox{(implicit)} \\
 290     &  |  & \verb+%+ & \mbox{(placeholder)} \\
 291     &  |  & \verb+?+n~[\verb+[+~\{\NT{subst}\}~\verb+]+] & \mbox{(meta)} \\
 292     &  |  & \verb+let+~\NT{ptname}~\verb+\def+~\NT{term}~\verb+in+~\NT{term} \\
 293     &  |  & \verb+let+~\NT{kind}~\NT{defs}~\verb+in+~\NT{term} \\
 294     &  |  & \NT{binder}~\{\NT{ptnames}\}^{+}~\verb+.+~\NT{term} \\
 295     &  |  & \NT{term}~\NT{term} & \mbox{(application)} \\
 296     &  |  & \verb+Prop+ \mid \verb+Set+ \mid \verb+Type+ \mid \verb+CProp+ & \mbox{(sort)} \\
 297     &  |  & [\verb+[+~\NT{term}~\verb+]+]~\verb+match+~\NT{term}~\verb+with [+~[\NT{rule}~\{\verb+|+~\NT{rule}\}]~\verb+]+ & \mbox{(pattern match)} \\
 298     &  |  & \verb+(+~\NT{term}~\verb+:+~\NT{term}~\verb+)+ & \mbox{(cast)} \\
 299     &  |  & \verb+(+~\NT{term}~\verb+)+ \\
 300     &  |  & \BLOB(\NT{meta},\dots,\NT{meta}) & \mbox{(meta blob)} \\
 301   \NT{defs}  & ::= & & \mbox{\bf mutual definitions} \\
 302     &     & \NT{fun}~\{\verb+and+~\NT{fun}\} \\
 303   \NT{fun} & ::= & & \mbox{\bf functions} \\
 304     &     & \NT{arg}~\{\NT{ptnames}\}^{+}~[\verb+on+~x]~\verb+\def+~\NT{term} \\
 305   \NT{binder} & ::= & & \mbox{\bf binders} \\
 306     &     & \verb+\Pi+ \mid \verb+\exists+ \mid \verb+\forall+ \mid \verb+\lambda+ \\
 307   \NT{arg} & ::= & & \mbox{\bf single argument} \\
 308     &     & \verb+_+ \mid x \mid \BLOB(\NT{meta},\dots,\NT{meta}) \\
 309   \NT{ptname} & ::= & & \mbox{\bf possibly typed name} \\
 310     &     & \NT{arg} \\
 311     &  |  & \verb+(+~\NT{arg}~\verb+:+~\NT{term}~\verb+)+ \\
 312   \NT{ptnames} & ::= & & \mbox{\bf bound variables} \\
 313     &     & \NT{arg} \\
 314     &  |  & \verb+(+~\NT{arg}~\{\verb+,+~\NT{arg}\}~[\verb+:+~\NT{term}]~\verb+)+ \\
 315   \NT{kind} & ::= & & \mbox{\bf induction kind} \\
 316     &     & \verb+rec+ \mid \verb+corec+ \\
 317   \NT{rule} & ::= & & \mbox{\bf rules} \\
 318     &     & x~\{\NT{ptname}\}~\verb+\Rightarrow+~\NT{term} \\[10ex]
 319
 320   \NT{meta} & ::= & & \mbox{\bf meta} \\
 321     &     & \BLOB(\NT{term},\dots,\NT{term}) & \mbox{(term blob)} \\
 322     &  |  & [\verb+term+]~x \\
 323     &  |  & \verb+number+~x \\
 324     &  |  & \verb+ident+~x \\
 325     &  |  & \verb+fresh+~x \\
 326     &  |  & \verb+anonymous+ \\
 327     &  |  & \verb+fold+~[\verb+left+\mid\verb+right+]~\NT{meta}~\verb+rec+~x~\NT{meta} \\
 328     &  |  & \verb+default+~\NT{meta}~\NT{meta} \\
 329     &  |  & \verb+if+~\NT{meta}~\verb+then+~\NT{meta}~\verb+else+~\NT{meta} \\
 330     &  |  & \verb+fail+
 331 \end{array}
 332 \]
 333 \hrule
 334 \end{table}
 335
 336 Each time a \texttt{notation} statement is evaluated by \MATITA{} a new parsing
 337 rule, extracted from the concrete syntax pattern, is added to the term parser
 338 and a semantic action which build a content level term, extracted from the
 339 content level pattern, is associated to it. We will now describe in turn what
 340 can be part of a concrete syntax pattern and what can be part of a content level
 341 pattern.
 342
 343 Concrete syntax patterns, whose abstract syntax can additionally be found in
 344 Tab.~\ref{tab:l1a} can be made of several components. The most basic of which
 345 are \emph{literal symbols} (like the ``+'' in the example above) and \emph{term
 346 variables} (like ``a'' and ``b''). During the extraction of parsing rules
 347 literal symbols are mapped to productions expecting those symbols verbatim as
 348 input and term variables as production expecting other terms (instances of the
 349 same parsing rule we are extending, possibly with different precedence and/or
 350 associativity).
 351
 352 \ldots
 353
 354 \subsubsection{From content level to CIC}
 355
 356 Responsible of mapping content level terms to CIC terms is the disambiguation
 357 algorithm implemented in \MATITA. Since it has already been described
 358 elsewhere~\cite{disambiguation} we wont enter in too much details here. We only
 359 give some highlights of its fundamental concepts.
 360
 361 \subsubsection{Sources of ambiguity}
 362
 363 The translation from content level terms to CIC terms is not straightforward
 364 because some nodes of the content encoding admit more that one CIC encoding,
 365 invalidating requirement (2).
 366
 367 \begin{example}
 368  \label{ex:disambiguation}
 369
 370  Consider the term at the concrete syntax level \texttt{\TEXMACRO{forall} x. x +
 371  ln 1 = x} of Fig.~\ref{fig:inputphase}(a), it can be the type of a lemma the
 372  user may want to prove. Assuming that both \texttt{+} and \texttt{=} are parsed
 373  as infix operators, all the following questions are legitimate and must be
 374  answered before obtaining a CIC term from its content level encoding
 375  (Fig.~\ref{fig:inputphase}(b)):
 376
 377  \begin{enumerate}
 378
 379   \item Since \texttt{ln} is an unbound identifier, which CIC constants does it
 380    represent? Many different theorems in the library may share its (rather
 381    short) name \dots
 382
 383   \item Which kind of number (\IN, \IR, \dots) the \texttt{1} literal stand for?
 384    Which encoding is used in CIC to represent it? E.g., assuming $1\in\IN$, is
 385    it an unary or a binary encoding?
 386
 387   \item Which kind of equality the ``='' node represents? Is it Leibniz's
 388    polymorhpic equality? Is it a decidable equality over \IN, \IR, \dots?
 389
 390  \end{enumerate}
 391
 392 \end{example}
 393
 394 In \MATITA, three \emph{sources of ambiguity} are admitted for content level
 395 terms: unbound identifiers, literal numbers, and operators. Each instance of
 396 ambiguity sources (ambiguous entity) occuring in a content level term is
 397 associated to a \emph{disambiguation domain}. Intuitively a disambiguation
 398 domain is a set of CIC terms which may be replaced for an ambiguous entity
 399 during disambiguation. Each item of the domain is said to be an
 400 \emph{interpretation} for the ambiguous entity.
 401
 402 \emph{Unbound identifiers} (question 1) are ambiguous entities since the
 403 namespace of CIC objects is not flat and the same identifier may denote many
 404 ofthem. For example the short name \texttt{plus\_assoc} in the \HELM{} library
 405 is shared by three different theorems stating the associative property of
 406 different additions.  This kind of ambiguity is avoidable if the user is willing
 407 to use long names (in form of URIs in the \texttt{cic://} scheme) in the
 408 concrete syntax, with the obvious drawbacks of obtaining long and unreadable
 409 terms.
 410
 411 Given an unbound identifier, the corresponding disambiguation domain is computed
 412 querying the library for all constants, inductive types, and inductive type
 413 constructors having it as their short name (see the \LOCATE{} query in
 414 Sect.~\ref{sec:metadata}).
 415
 416 \emph{Literal numbers} (question 2) are ambiguous entities as well since
 417 different kinds of numbers can be encoded in CIC (\IN, \IR, \IZ, \dots) using
 418 different encodings. Considering the restricted example of natural numbers we
 419 can for instance encode them in CIC using inductive datatypes with a number of
 420 constructor equal to the encoding base plus 1, obtaining one encoding for each
 421 base.
 422
 423 For each possible way of mapping a literal number to a CIC term, \MATITA{} is
 424 aware of a \emph{number intepretation function} which, when applied to the
 425 natural number denoted by the literal\footnote{at the moment only literal
 426 natural number are supported in the concrete syntax} returns a corresponding CIC
 427 term. The disambiguation domain for a given literal number is built applying to
 428 the literal all available number interpretation functions in turn.
 429
 430 Number interpretation functions can at the moment only be defined in OCaml, but
 431 a mechanism to enable their definition directly in \MATITA{} is under
 432 developement.
 433
 434 \emph{Operators} (question 3) are intuitively head of applications, as such they
 435 are always applied to a (possiblt empty) sequence of arguments. Their ambiguity
 436 is a need since it is often the case that some notation is used in an overloaded
 437 fashion to hide the use of different CIC constants which encodes similar
 438 concepts. For example, in the standard library of \MATITA{} the infix \texttt{+}
 439 notation is available building a binary \texttt{Op(+)} node, whose
 440 disambiguation domain may refer to different constants like the addition over
 441 natural numbers \URI{cic:/matita/nat/plus/plus.con} or that over real numbers of
 442 the \COQ{} standard library \URI{cic:/Coq/Reals/Rdefinitions/Rplus.con}.
 443
 444 For each possible way of mapping an operator application to a CIC term,
 445 \MATITA{} knows an \emph{operator interpretation function} which, when applied
 446 to an operator and its arguments, returns a CIC term. The disambiguation domain
 447 for a given operator is built applying to the operator and its arguments all
 448 available operator interpretation functions in turn.
 449
 450 Operator interpretation functions could be added using the
 451 \texttt{interpretation} statement. For example, among the first line of the
 452 script \texttt{matita/library/logic/equality.ma} from the \MATITA{} standard
 453 library we read:
 454
 455 \begin{Verbatim}
 456 interpretation "leibnitz's equality"
 457  'eq x y =
 458    (cic:/matita/logic/equality/eq.ind#xpointer(1/1) _ x y).
 459 \end{Verbatim}
 460
 461 Evaluating it in \MATITA{} will add an operator interpretation function for the
 462 binary operator \texttt{eq} which expands to the CIC term on the right hand side
 463 of the statement. That CIC term can be written using only built-in concrete
 464 syntax, can contain no ambiguity source; still, it can refer to operator
 465 arguments bound on the left hand side and can contain implicit terms (denoted
 466 with \texttt{\_}) which will be expanded to fresh metavariables. The latter
 467 feature is used in the example above for the first argument of Leibniz's
 468 polymorhpic equality.
 469
 470 \subsubsection{Disambiguation algorithm}
 471
 472 A \emph{disambiguation algorithm} takes as input a content level term and return
 473 a fully determined CIC term. The key observation on which a disambiguation
 474 algorithm is based is that given a content level term with more than one sources
 475 of ambiguity, not all possible combination of interpretation lead to a typable
 476 CIC term. In the term of Ex.~\ref{ex:disambiguation} for instance the
 477 interpretation of \texttt{ln} as a function from \IR to \IR and the
 478 interpretation of \texttt{1} as the Peano number $1$ can't coexists. The notion
 479 of ``can't coexists'' in the disambiguation of \MATITA{} is defined on top of
 480 the \emph{refiner} for CIC terms described in~\cite{csc-phd}.
 481
 482 Briefly, a refiner is a function whose input is an \emph{incomplete CIC term}
 483 $t_1$ --- i.e. a term where metavariables occur (Sect.~\ref{sec:metavariables}
 484 --- and whose output is either
 485
 486 \begin{enumerate}
 487
 488  \item an incomplete CIC term $t_2$ where $t_2$ is a well-typed term obtained
 489   assigning a type to each metavariable in $t_1$ (in case of dependent types,
 490   instantiation of some of the metavariable occurring in $t_1$ may occur as
 491   well);
 492
 493  \item $\epsilon$, meaning that no well-typed term could be obtained via
 494   assignment of type to metavariable in $t_1$ and their instantiation;
 495
 496  \item $\bot$, meaning that the refiner is unable to decide whether of the two
 497   cases above apply (refinement is semi-decidable).
 498
 499 \end{enumerate}
 500
 501 On top of a CIC refiner \MATITA{} implement an efficient disambiguation
 502 algorithm, which is outlined below. It takes as input a content level term $c$
 503 and proceeds as follows:
 504
 505 \begin{enumerate}
 506
 507  \item Create disambiguation domains $\{D_i | i\in\mathit{Dom}(c)\}$, where
 508   $\mathit{Dom}(c)$ is the set of ambiguity sources of $c$. Each $D_i$ is a set
 509   of CIC terms and can be built as described above.
 510
 511  \item An \emph{interpretation} $\Phi$ for $c$ is a map associating an
 512   incomplete CIC term to each ambiguity source of $c$. Given $c$ and one of its
 513   interpretations an incomplete CIC term is fully determined replacing each
 514   ambiguity source of $c$ with its mapping in the interpretation and injecting
 515   the remaining structure of the content level in the CIC level (e.g. replacing
 516   the application of the content level with the application of the CIC level).
 517   This operation is informally called ``interpreting $c$ with $\Phi$''.
 518
 519   Create an initial interpretation $\Phi_0 = \{\phi_i | \phi_i = \_,
 520   i\in\mathit{Dom}(c)\}$, which associates a fresh metavariable to each source
 521   of ambiguity of $c$. During this step, implicit terms are expanded to fresh
 522   metavariables as well.
 523
 524  \item Refine the current incomplete CIC term (i.e.  the term obtained
 525   interpreting $t$ with $\Phi_i$).
 526
 527   If the refinement succeeds or is undetermined the next interpretation
 528   $\Phi_{i+1}$ will be created \emph{making a choice}, that is replacing in the
 529   current interpretation one of the metavariable appearing in $\Phi_i$ with one
 530   of the possible choice from the corresponding disambiguation domain. The
 531   metavariable to be replaced is chosen following a preorder visit of the
 532   ambiguous term. Then, step 3 is attempted again with the new interpretation.
 533
 534   If the refinement fails the current set of choices cannot lead to a well-typed
 535   term and backtracking of the current interpretation is attempted.
 536
 537  \item Once an unambiguous correct interpretation is found (i.e. $\Phi_i$ does
 538   no longer contain any placeholder), backtracking is attempted anyway to find
 539   the other correct interpretations.
 540
 541  \item Let $n$ be the number of interpretations who survived step 4. If $n=0$
 542   signal a type error. If $n=1$ we have found exactly one (incomplete) CIC term
 543   corresponding to the content level term $c$, returns it as output of the
 544   disambiguation phase. If $n>1$ we have found many different (incomplete) CIC
 545   terms which can correspond to the content level term, let the user choose one
 546   of the $n$ interpretations and returns the corresponding term.
 547
 548 \end{enumerate}
 549
 550 The efficiency of this algorithm resides in the fact that as soon as an
 551 incomplete CIC term is not typable, no further instantiation of the
 552 metavariables of the corresponding interpretation is attemped.
 553 % For example, during the disambiguation of the user input
 554 % \texttt{\TEXMACRO{forall} x. x*0 = 0}, an interpretation $\Phi_i$ is
 555 % encountered which associates $?$ to the instance of \texttt{0} on the right,
 556 % the real number $0$ to the instance of \texttt{0} on the left, and the
 557 % multiplication over natural numbers (\texttt{mult} for short) to \texttt{*}.
 558 % The refiner will fail, since \texttt{mult} require a natural argument, and no
 559 % further instantiation of the placeholder will be tried.
 560
 561 Details of the disambiguation algorithm along with an analysis of its complexity
 562 can be found in~\cite{disambiguation}, where a formulation without backtracking
 563 (corresponding to the actual \MATITA{} implementation) is also presented.
 564
 565 \subsubsection{Disambiguation stages}
 566
 567 \section{Environment}
 568
 569 \[
 570 \begin{array}{rcll}
 571   V & ::= & & \mbox{(\bf values)} \\
 572     &     & \verb+Term+~T & \mbox{(term)} \\
 573     &  |  & \verb+String+~s & \mbox{(string)} \\
 574     &  |  & \verb+Number+~n & \mbox{(number)} \\
 575     &  |  & \verb+None+ & \mbox{(optional value)} \\
 576     &  |  & \verb+Some+~V & \mbox{(optional value)} \\
 577     &  |  & [V_1,\dots,V_n] & \mbox{(list value)} \\[2ex]
 578 \end{array}
 579 \]
 580
 581 An environment is a map $\mathcal E : \mathit{Name} -> V$.
 582
 583 \section{Level 1: concrete syntax}
 584
 585 Rationale: while the layout schemata can occur in the concrete syntax
 586 used by user, the box schemata and the magic patterns can only occur
 587 when defining the notation. This is why the layout schemata are
 588 ``escaped'' with a backslash, so that they cannot be confused with
 589 plain identifiers, wherease the others are not. Alternatively, they
 590 could be defined as keywords, but this would prevent their names to be
 591 used in different contexts.
 592
 593 \[
 594 \ITO{\cdot}{{}} : P -> \mathit{Env} -> T
 595 \]
 596
 597 \begin{table}
 598 \caption{\label{tab:il1f2} Instantiation of level 1 patterns from level 2.\strut}
 599 \hrule
 600 \[
 601 \begin{array}{rcll}
 602   \ITO{L_\kappa[P_1,\dots,P_n]}{E} & = & L_\kappa[\ITO{(P_1)}{E},\dots,\ITO{(P_n)}{E} ] \\
 603   \ITO{B_\kappa^{ab}[P_1\cdots P_n]}{E} & = & B_\kappa^{ab}[\ITO{P_1}{E}\cdots\ITO{P_n}{E}] \\
 604   \ITO{\BREAK}{E} & = & \BREAK \\
 605   \ITO{(P)}{E} & = & \ITO{P}{E} \\
 606   \ITO{(P_1\cdots P_n)}{E} & = & B_H^{00}[\ITO{P_1}{E}\cdots\ITO{P_n}{E}] \\
 607   \ITO{\TVAR{x}}{E} & = & t & \mathcal{E}(x) = \verb+Term+~t \\
 608   \ITO{\NVAR{x}}{E} & = & l & \mathcal{E}(x) = \verb+Number+~l \\
 609   \ITO{\IVAR{x}}{E} & = & l & \mathcal{E}(x) = \verb+String+~l \\
 610   \ITO{\mathtt{opt}~P}{E} & = & \varepsilon & \mathcal{E}(\NAMES(P)) = \{\mathtt{None}\} \\
 611   \ITO{\mathtt{opt}~P}{E} & = & \ITO{P}{E'} & \mathcal{E}(\NAMES(P)) = \{\mathtt{Some}~v_1,\dots,\mathtt{Some}~v_n\} \\
 612   & & & \mathcal{E}'(x)=\left\{
 613   \begin{array}{@{}ll}
 614     v, & \mathcal{E}(x) = \mathtt{Some}~v \\
 615     \mathcal{E}(x), & \mbox{otherwise}
 616   \end{array}
 617   \right. \\
 618   \ITO{\mathtt{list}k~P~l?}{E} & = & \ITO{P}{{E}_1}~{l?}\cdots {l?}~\ITO{P}{{E}_n} &
 619     \mathcal{E}(\NAMES(P)) = \{[v_{11},\dots,v_{1n}],\dots,[v_{m1},\dots,v_{mn}]\} \\
 620     & & & n\ge k \\
 621   & & & \mathcal{E}_i(x) = \left\{
 622   \begin{array}{@{}ll}
 623     v_i, & \mathcal{E}(x) = [v_1,\dots,v_n] \\
 624     \mathcal{E}(x), & \mbox{otherwise}
 625   \end{array}
 626   \right. \\
 627   \ITO{l}{E} & = & l \\
 628
 629 %%     &  |  & (P) & \mbox{(fenced)} \\
 630 %%     &  |  & M & \mbox{(magic)} \\
 631 %%     &  |  & V & \mbox{(variable)} \\
 632 %%     &  |  & l & \mbox{(literal)} \\[2ex]
 633 %%   V & ::= & & \mbox{(\bf variables)} \\
 634 %%     &     & \TVAR{x} & \mbox{(term variable)} \\
 635 %%     &  |  & \NVAR{x} & \mbox{(number variable)} \\
 636 %%     &  |  & \IVAR{x} & \mbox{(name variable)} \\[2ex]
 637 %%   M & ::= & & \mbox{(\bf magic patterns)} \\
 638 %%     &     & \verb+list0+~S~l? & \mbox{(possibly empty list)} \\
 639 %%     &  |  & \verb+list1+~S~l? & \mbox{(non-empty list)} \\
 640 %%     &  |  & \verb+opt+~S & \mbox{(option)} \\[2ex]
 641 \end{array}
 642 \]
 643 \hrule
 644 \end{table}
 645
 646 \begin{table}
 647 \caption{\label{tab:wfl0} Well-formedness rules for level 1 patterns.\strut}
 648 \hrule
 649 \[
 650 \renewcommand{\arraystretch}{3.5}
 651 \begin{array}[t]{@{}c@{}}
 652   \inference[\sc layout]
 653     {P_i :: D_i & \forall i,j, i\ne j => \DOMAIN(D_i) \cap \DOMAIN(D_j) = \emptyset}
 654     {L_\kappa[P_1,\dots,P_n] :: D_1\oplus\cdots\oplus D_n}
 655   \\
 656   \inference[\sc box]
 657     {P_i :: D_i & \forall i,j, i\ne j => \DOMAIN(D_i) \cap \DOMAIN(D_j) = \emptyset}
 658     {B_\kappa^{ab}[P_1\cdots P_n] :: D_1\oplus\cdots\oplus D_n}
 659   \\
 660   \inference[\sc fenced]
 661     {P_i :: D_i & \forall i,j, i\ne j => \DOMAIN(D_i) \cap \DOMAIN(D_j) = \emptyset}
 662     {\FENCED{P_1\cdots P_n} :: D_1\oplus\cdots\oplus D_n}
 663   \\
 664   \inference[\sc breakpoint]
 665     {}
 666     {\BREAK :: \emptyset}
 667   \qquad
 668   \inference[\sc literal]
 669     {}
 670     {l :: \emptyset}
 671   \qquad
 672   \inference[\sc tvar]
 673     {}
 674     {\TVAR{x} :: \TVAR{x}}
 675   \\
 676   \inference[\sc nvar]
 677     {}
 678     {\NVAR{x} :: \NVAR{x}}
 679   \qquad
 680   \inference[\sc ivar]
 681     {}
 682     {\IVAR{x} :: \IVAR{x}}
 683   \\
 684   \inference[\sc list0]
 685     {P :: D & \forall x\in\DOMAIN(D), D'(x) = D(x)~\mathtt{List}}
 686     {\mathtt{list0}~P~l? :: D'}
 687   \\
 688   \inference[\sc list1]
 689     {P :: D & \forall x\in\DOMAIN(D), D'(x) = D(x)~\mathtt{List}}
 690     {\mathtt{list1}~P~l? :: D'}
 691   \\
 692   \inference[\sc opt]
 693     {P :: D & \forall x\in\DOMAIN(D), D'(x) = D(x)~\mathtt{Option}}
 694     {\mathtt{opt}~P :: D'}
 695 \end{array}
 696 \]
 697 \hrule
 698 \end{table}
 699
 700 \newcommand{\ATTRS}[1]{\langle#1\rangle}
 701 \newcommand{\ANNPOS}[2]{\mathit{pos}(#1)_{#2}}
 702
 703 \begin{table}
 704 \caption{\label{tab:addparens} Can't read the AST and need parentheses? Here you go!.\strut}
 705 \hrule
 706 \[
 707 \begin{array}{rcll}
 708   \ADDPARENS{l}{n} & = & l \\
 709   \ADDPARENS{\BREAK}{n} & = & \BREAK \\
 710   \ADDPARENS{\ATTRS{\mathit{prec}=m}T}{n} & = & \ADDPARENS{T}{m} & n < m \\
 711   \ADDPARENS{\ATTRS{\mathit{prec}=m}T}{n} & = & \FENCED{\ADDPARENS{T}{\bot}} & n > m \\
 712   \ADDPARENS{\ATTRS{\mathit{prec}=n,\mathit{assoc}=L,\mathit{pos}=R}T}{n} & = & \FENCED{\ADDPARENS{T}{\bot}} \\
 713   \ADDPARENS{\ATTRS{\mathit{prec}=n,\mathit{assoc}=R,\mathit{pos}=L}T}{n} & = & \FENCED{\ADDPARENS{T}{\bot}} \\
 714   \ADDPARENS{\ATTRS{\cdots}T}{n} & = & \ADDPARENS{T}{n} \\
 715   \ADDPARENS{L_\kappa[T_1,\dots,\underline{T_k},\dots,T_m]}{n} & = & L_\kappa[\ADDPARENS{T_1}{n},\dots,\ADDPARENS{T_k}{\bot},\dots,\ADDPARENS{T_m}{n}] \\
 716   \ADDPARENS{B_\kappa^{ab}[T_1,\dots,T_m]}{n} & = & B_\kappa^{ab}[\ADDPARENS{T_1}{n},\dots,\ADDPARENS{T_m}{n}]
 717 \end{array}
 718 \]
 719 \hrule
 720 \end{table}
 721
 722 \begin{table}
 723 \caption{\label{tab:annpos} Annotation of level 1 meta variable with position information.\strut}
 724 \hrule
 725 \[
 726 \begin{array}{rcll}
 727   \ANNPOS{l}{p,q} & = & l \\
 728   \ANNPOS{\BREAK}{p,q} & = & \BREAK \\
 729   \ANNPOS{x}{1,0} & = & \ATTRS{\mathit{pos}=L}{x} \\
 730   \ANNPOS{x}{0,1} & = & \ATTRS{\mathit{pos}=R}{x} \\
 731   \ANNPOS{x}{p,q} & = & \ATTRS{\mathit{pos}=I}{x} \\
 732   \ANNPOS{B_\kappa^{ab}[P]}{p,q} & = & B_\kappa^{ab}[\ANNPOS{P}{p,q}] \\
 733   \ANNPOS{B_\kappa^{ab}[\{\BREAK\} P_1\cdots P_n\{\BREAK\}]}{p,q} & = & B_\kappa^{ab}[\begin{array}[t]{@{}l}
 734       \{\BREAK\} \ANNPOS{P_1}{p,0} \\
 735       \ANNPOS{P_2}{0,0}\cdots\ANNPOS{P_{n-1}}{0,0} \\
 736       \ANNPOS{P_n}{0,q}\{\BREAK\}]
 737   \end{array}
 738
 739 %%     &     & L_\kappa[P_1,\dots,P_n] & \mbox{(layout)} \\
 740 %%     &  |  & \BREAK & \mbox{(breakpoint)} \\
 741 %%     &  |  & \FENCED{P_1\cdots P_n} & \mbox{(fenced)} \\
 742 %%   V & ::= & & \mbox{(\bf variables)} \\
 743 %%     &     & \TVAR{x} & \mbox{(term variable)} \\
 744 %%     &  |  & \NVAR{x} & \mbox{(number variable)} \\
 745 %%     &  |  & \IVAR{x} & \mbox{(name variable)} \\[2ex]
 746 %%   M & ::= & & \mbox{(\bf magic patterns)} \\
 747 %%     &     & \verb+list0+~P~l? & \mbox{(possibly empty list)} \\
 748 %%     &  |  & \verb+list1+~P~l? & \mbox{(non-empty list)} \\
 749 %%     &  |  & \verb+opt+~P & \mbox{(option)} \\[2ex]
 750 \end{array}
 751 \]
 752 \hrule
 753 \end{table}
 754
 755 \section{Level 2: abstract syntax}
 756
 757 \begin{table}
 758 \caption{\label{tab:wfl2} Well-formedness rules for level 2 patterns.\strut}
 759 \hrule
 760 \[
 761 \renewcommand{\arraystretch}{3.5}
 762 \begin{array}{@{}c@{}}
 763   \inference[\sc Constr]
 764     {P_i :: D_i}
 765     {\BLOB[P_1,\dots,P_n] :: D_i \oplus \cdots \oplus D_j} \\
 766   \inference[\sc TermVar]
 767     {}
 768     {\mathtt{term}~x :: x : \mathtt{Term}}
 769   \quad
 770   \inference[\sc NumVar]
 771     {}
 772     {\mathtt{number}~x :: x : \mathtt{Number}}
 773   \\
 774   \inference[\sc IdentVar]
 775     {}
 776     {\mathtt{ident}~x :: x : \mathtt{String}}
 777   \quad
 778   \inference[\sc FreshVar]
 779     {}
 780     {\mathtt{fresh}~x :: x : \mathtt{String}}
 781   \\
 782   \inference[\sc Success]
 783     {}
 784     {\mathtt{anonymous} :: \emptyset}
 785   \\
 786   \inference[\sc Fold]
 787     {P_1 :: D_1 & P_2 :: D_2 \oplus (x : \mathtt{Term}) & \DOMAIN(D_2)\ne\emptyset & \DOMAIN(D_1)\cap\DOMAIN(D_2)=\emptyset}
 788     {\mathtt{fold}~P_1~\mathtt{rec}~x~P_2 :: D_1 \oplus D_2~\mathtt{List}}
 789   \\
 790   \inference[\sc Default]
 791     {P_1 :: D \oplus D_1 & P_2 :: D & \DOMAIN(D_1) \ne \emptyset & \DOMAIN(D) \cap \DOMAIN(D_1) = \emptyset}
 792     {\mathtt{default}~P_1~P_2 :: D \oplus D_1~\mathtt{Option}}
 793   \\
 794   \inference[\sc If]
 795     {P_1 :: \emptyset & P_2 :: D & P_3 :: D }
 796     {\mathtt{if}~P_1~\mathtt{then}~P_2~\mathtt{else}~P_3 :: D}
 797   \qquad
 798   \inference[\sc Fail]
 799     {}
 800     {\mathtt{fail} :: \emptyset}
 801 %%     &  |  & \verb+if+~\NT{meta}~\verb+then+~\NT{meta}~\verb+else+~\NT{meta} \\
 802 %%     &  |  & \verb+fail+
 803 \end{array}
 804 \]
 805 \hrule
 806 \end{table}
 807
 808 \begin{table}
 809  \caption{\label{tab:il2f1} Instantiation of level 2 patterns from level 1.
 810  \strut}
 811 \hrule
 812 \[
 813 \begin{array}{rcll}
 814
 815 \IOT{C[t_1,\dots,t_n]}{\mathcal{E}} & =
 816 & C[\IOT{t_1}{\mathcal{E}},\dots,\IOT{t_n}{\mathcal{E}}] \\
 817
 818 \IOT{\mathtt{term}~x}{\mathcal{E}} & = & t & \mathcal{E}(x) = \mathtt{Term}~t \\
 819
 820 \IOT{\mathtt{number}~x}{\mathcal{E}} & =
 821 & n & \mathcal{E}(x) = \mathtt{Number}~n \\
 822
 823 \IOT{\mathtt{ident}~x}{\mathcal{E}} & =
 824 & y & \mathcal{E}(x) = \mathtt{String}~y \\
 825
 826 \IOT{\mathtt{fresh}~x}{\mathcal{E}} & = & y & \mathcal{E}(x) = \mathtt{String}~y \\
 827
 828 \IOT{\mathtt{default}~P_1~P_2}{\mathcal{E}} & =
 829 & \IOT{P_1}{\UPDATE{\mathcal{E}}{x_i|->v_i}}
 830 & \mathcal{E}(x_i)=\mathtt{Some}~v_i \\
 831 & & & \NAMES(P_1)\setminus\NAMES(P_2)=\{x_1,\dots,x_n\} \\
 832
 833 \IOT{\mathtt{default}~P_1~P_2}{\mathcal{E}} & =
 834 & \IOT{P_2}{\UPDATE{\mathcal{E}}{x_i|->\bot}}
 835 & \mathcal{E}(x_i)=\mathtt{None} \\
 836 & & & \NAMES(P_1)\setminus\NAMES(P_2)=\{x_1,\dots,x_n\} \\
 837
 838 \IOT{\mathtt{fold}~\mathtt{right}~P_1~\mathtt{rec}~x~P_2}{\mathcal{E}}
 839 & =
 840 & \IOT{P_1}{\mathcal{E}'}
 841 & \mathcal{E}(\NAMES(P_2)\setminus\{x\}) = \{[],\dots,[]\} \\
 842 & & \multicolumn{2}{l}{\mathcal{E}'=\UPDATE{\mathcal{E}}{\NAMES(P_2)\setminus\{x\}|->\bot}}
 843 \\
 844
 845 \IOT{\mathtt{fold}~\mathtt{right}~P_1~\mathtt{rec}~x~P_2}{\mathcal{E}}
 846 & =
 847 & \IOT{P_2}{\mathcal{E}'}
 848 & \mathcal{E}(y_i) = [v_{i1},\dots,v_{in}] \\
 849 & & & \NAMES(P_2)\setminus\{x\}=\{y_1,\dots,y_m\} \\
 850 & & \multicolumn{2}{l}{\mathcal{E}'(y) =
 851  \left\{
 852  \begin{array}{@{}ll}
 853  \IOT{\mathtt{fold}~\mathtt{right}~P_1~\mathtt{rec}~x~P_e}{\mathcal{E}''}
 854                 & y=x \\
 855  v_{i1}         & y=y_i \\
 856  \mathcal{E}(y) & \mbox{otherwise} \\
 857  \end{array}
 858  \right.} \\
 859 & & \multicolumn{2}{l}{\mathcal{E}''(y) =
 860  \left\{
 861  \begin{array}{@{}ll}
 862  [v_{i2};\dots;v_{in}] & y=y_i \\
 863  \mathcal{E}(y)        & \mbox{otherwise} \\
 864  \end{array}
 865  \right.} \\
 866
 867 \IOT{\mathtt{fold}~\mathtt{left}~P_1~\mathtt{rec}~x~P_2}{\mathcal{E}}
 868 & =
 869 & \mathit{eval\_fold}(x,P_2,\mathcal{E}')
 870 & \\
 871 & & \multicolumn{2}{l}{\mathcal{E}' = \UPDATE{\mathcal{E}}{x|->
 872 \IOT{P_1}{\UPDATE{\mathcal{E}}{\NAMES(P_2)|->\bot}}}} \\
 873
 874 \mathit{eval\_fold}(x,P,\mathcal{E})
 875 & =
 876 & \mathcal{E}(x)
 877 & \mathcal{E}(\NAMES(P)\setminus\{x\})=\{[],\dots,[]\} \\
 878
 879 \mathit{eval\_fold}(x,P,\mathcal{E})
 880 & =
 881 & \mathit{eval\_fold}(x,P,\mathcal{E}')
 882 & \mathcal{E}(y_i) = [v_{i1},\dots,v_{in}] \\
 883 & & & \NAMES(P)\setminus{x}=\{y_1,\dots,y_m\} \\
 884 &
 885 & \multicolumn{2}{l}{\mathcal{E}' = \UPDATE{\mathcal{E}}{x|->\IOT{P}{\mathcal{E}''}; ~ y_i |-> [v_{i2};\dots;v_{in_i}]}}
 886 \\
 887 &
 888 & \multicolumn{2}{l}{\mathcal{E}''(y) =
 889 \left\{
 890 \begin{array}{ll}
 891  v_1            & y\in \NAMES(P)\setminus\{x\} \\
 892  \mathcal{E}(x) & y=x \\
 893  \bot           & \mathit{otherwise} \\
 894 \end{array}
 895 \right.
 896 }
 897 \\
 898
 899 \end{array} \\
 900 \]
 901 \end{table}
 902
 903 \begin{table}
 904 \caption{\label{tab:l2match} Pattern matching of level 2 terms.\strut}
 905 \hrule
 906 \[
 907 \renewcommand{\arraystretch}{3.5}
 908 \begin{array}{@{}c@{}}
 909   \inference[\sc Constr]
 910     {t_i \in P_i ~> \mathcal E_i & i\ne j => \DOMAIN(\mathcal E_i)\cap\DOMAIN(\mathcal E_j)=\emptyset}
 911     {C[t_1,\dots,t_n] \in C[P_1,\dots,P_n] ~> \mathcal E_1 \oplus \cdots \oplus \mathcal E_n}
 912   \\
 913   \inference[\sc TermVar]
 914     {}
 915     {t \in [\mathtt{term}]~x ~> [x |-> \mathtt{Term}~t]}
 916   \quad
 917   \inference[\sc NumVar]
 918     {}
 919     {n \in \mathtt{number}~x ~> [x |-> \mathtt{Number}~n]}
 920   \\
 921   \inference[\sc IdentVar]
 922     {}
 923     {x \in \mathtt{ident}~x ~> [x |-> \mathtt{String}~x]}
 924   \quad
 925   \inference[\sc FreshVar]
 926     {}
 927     {x \in \mathtt{fresh}~x ~> [x |-> \mathtt{String}~x]}
 928   \\
 929   \inference[\sc Success]
 930     {}
 931     {t \in \mathtt{anonymous} ~> \emptyset}
 932   \\
 933   \inference[\sc DefaultT]
 934     {t \in P_1 ~> \mathcal E}
 935     {t \in \mathtt{default}~P_1~P_2 ~> \mathcal E'}
 936     \quad
 937     \mathcal E'(x) = \left\{
 938     \renewcommand{\arraystretch}{1}
 939     \begin{array}{ll}
 940       \mathtt{Some}~\mathcal{E}(x) & x \in \NAMES(P_1) \setminus \NAMES(P_2) \\
 941       \mathcal{E}(x) & \mbox{otherwise}
 942     \end{array}
 943     \right.
 944   \\
 945   \inference[\sc DefaultF]
 946     {t \not\in P_1 & t \in P_2 ~> \mathcal E}
 947     {t \in \mathtt{default}~P_1~P_2 ~> \mathcal E'}
 948     \quad
 949     \mathcal E'(x) = \left\{
 950     \renewcommand{\arraystretch}{1}
 951     \begin{array}{ll}
 952       \mathtt{None} & x \in \NAMES(P_1) \setminus \NAMES(P_2) \\
 953       \mathcal{E}(x) & \mbox{otherwise}
 954     \end{array}
 955     \right.
 956   \\
 957   \inference[\sc IfT]
 958     {t \in P_1 ~> \mathcal E' & t \in P_2 ~> \mathcal E}
 959     {t \in \mathtt{if}~P_1~\mathtt{then}~P_2~\mathtt{else}~P_3 ~> \mathcal E}
 960   \quad
 961   \inference[\sc IfF]
 962     {t \not\in P_1 & t \in P_3 ~> \mathcal E}
 963     {t \in \mathtt{if}~P_1~\mathtt{then}~P_2~\mathtt{else}~P_3 ~> \mathcal E}
 964   \\
 965   \inference[\sc FoldRec]
 966     {t \in P_2 ~> \mathcal E & \mathcal{E}(x) \in \mathtt{fold}~d~P_1~\mathtt{rec}~x~P_2 ~> \mathcal E'}
 967     {t \in \mathtt{fold}~d~P_1~\mathtt{rec}~x~P_2 ~> \mathcal E''}
 968   \\
 969   \mbox{where}~\mathcal{E}''(y) = \left\{
 970     \renewcommand{\arraystretch}{1}
 971     \begin{array}{ll}
 972       \mathcal{E}(y)::\mathcal{E}'(y) & y \in \NAMES(P_2) \setminus \{x\} \wedge d = \mathtt{right} \\
 973       \mathcal{E}'(y)@[\mathcal{E}(y)] & y \in \NAMES(P_2) \setminus \{x\} \wedge d = \mathtt{left} \\
 974       \mathcal{E}'(y) & \mbox{otherwise}
 975     \end{array}
 976   \right.
 977   \\
 978   \inference[\sc FoldBase]
 979     {t \not\in P_2 & t \in P_1 ~> \mathcal E}
 980     {t \in \mathtt{fold}~P_1~\mathtt{rec}~x~P_2 ~> \mathcal E'}
 981   \quad
 982   \mathcal E'(y) = \left\{
 983     \renewcommand{\arraystretch}{1}
 984     \begin{array}{ll}
 985       [] & y \in \NAMES(P_2) \setminus \{x\} \\
 986       \mathcal{E}(y) & \mbox{otherwise}
 987     \end{array}
 988   \right.
 989 \end{array}
 990 \]
 991 \hrule
 992 \end{table}
 993
 994 \begin{table}
 995  \caption{\label{tab:synl3} Abstract syntax of level 3 terms and patterns.}
 996  \hrule
 997  \[
 998  \begin{array}{@{}ll@{}}
 999   \begin{array}[t]{rcll}
1000    T & : := &   & \mbox{(\bf terms)} \\
1001      &      & u & \mbox{(uri)} \\
1002      &  |   & \lambda x.T & \mbox{($\lambda$-abstraction)} \\
1003      &  |   & (T_1 \dots T_n) & \mbox{(application)} \\
1004      &  |   & \dots \\[2ex]
1005   \end{array} &
1006   \begin{array}[t]{rcll}
1007    P & : := &   & \mbox{(\bf patterns)} \\
1008      &      & u & \mbox{(uri)} \\
1009      &  |   & V & \mbox{(variable)} \\
1010      &  |   & (P_1 \dots P_n) & \mbox{(application)} \\[2ex]
1011    V & : := &   & \mbox{(\bf variables)} \\
1012      &      & \TVAR{x} & \mbox{(term variable)} \\
1013      &  |   & \IMPVAR & \mbox{(implicit variable)} \\
1014   \end{array} \\
1015  \end{array}
1016  \]
1017  \hrule
1018 \end{table}
1019
1020 \begin{table}
1021 \caption{\label{tab:wfl3} Well-formedness rules for level 3 patterns.\strut}
1022 \hrule
1023 \[
1024 \renewcommand{\arraystretch}{3.5}
1025 \begin{array}{@{}c@{}}
1026  \inference[\sc Uri] {} {u :: \emptyset} \quad
1027  \inference[\sc ImpVar] {} {\TVAR{x} :: \emptyset} \quad
1028  \inference[\sc TermVar] {} {\TVAR{x} :: x:\mathtt{Term}} \\
1029  \inference[\sc Appl]
1030   {P_i :: D_i
1031    \quad \forall i,j,i\neq j=>\DOMAIN(D_i)\cap\DOMAIN(D_j)=\emptyset}
1032   {P_1\cdots P_n :: D_1\oplus\cdots\oplus D_n} \\
1033 \end{array}
1034 \]
1035 \hrule
1036 \end{table}
1037
1038 \begin{table}
1039  \caption{\label{tab:synargp} Abstract syntax of applicative symbol patterns.}
1040  \hrule
1041  \[
1042  \begin{array}{rcll}
1043   P & : := &           & \mbox{(\bf patterns)} \\
1044     &      & s ~ \{ \mathit{arg} \} & \mbox{(symbol pattern)} \\[2ex]
1045   \mathit{arg} & : := & & \mbox{(\bf argument)} \\
1046     &      & \TVAR{x} & \mbox{(term variable)} \\
1047     &  |   & \eta.\mathit{arg} & \mbox{($\eta$-abstraction)} \\
1048  \end{array}
1049  \]
1050  \hrule
1051 \end{table}
1052
1053 \begin{table}
1054 \caption{\label{tab:wfargp} Well-formedness rules for applicative symbol
1055 patterns.\strut}
1056 \hrule
1057 \[
1058 \renewcommand{\arraystretch}{3.5}
1059 \begin{array}{@{}c@{}}
1060  \inference[\sc Pattern]
1061   {\mathit{arg}_i :: D_i
1062    \quad \forall i,j,i\neq j=>\DOMAIN(D_i)\cap\DOMAIN(D_j)=\emptyset}
1063   {s~\mathit{arg}_1\cdots\mathit{arg}_n :: D_1\oplus\cdots\oplus D_n} \\
1064  \inference[\sc TermVar]
1065   {}
1066   {\TVAR{x} :: x : \mathtt{Term}}
1067  \quad
1068  \inference[\sc EtaAbs]
1069   {\mathit{arg} :: D}
1070   {\eta.\mathit{arg} :: D}
1071  \\
1072 \end{array}
1073 \]
1074 \hrule
1075 \end{table}
1076
1077 \begin{table}
1078 \caption{\label{tab:l3match} Pattern matching of level 3 terms.\strut}
1079 \hrule
1080 \[
1081 \renewcommand{\arraystretch}{3.5}
1082 \begin{array}{@{}c@{}}
1083  \inference[\sc Uri] {} {u\in u ~> []} \quad
1084  \inference[\sc Appl] {t_i\in P_i ~> \mathcal{E}_i}
1085   {(t_1\dots t_n)\in(P_1\dots P_n) ~>
1086    \mathcal{E}_1\oplus\cdots\oplus\mathcal{E}_n} \\
1087  \inference[\sc TermVar] {} {t\in \TVAR{x} ~> [x |-> \mathtt{Term}~t]} \quad
1088  \inference[\sc ImpVar] {} {t\in \IMPVAR ~> []} \\
1089 \end{array}
1090 \]
1091 \hrule
1092 \end{table}
1093
1094 \begin{table}
1095 \caption{\label{tab:iapf3} Instantiation of applicative symbol patterns (from
1096 level 3).\strut}
1097 \hrule
1098 \[
1099 \begin{array}{rcll}
1100  \IAP{s~a_1\cdots a_n}{\mathcal{E}} & = &
1101   (s~\IAPP{a_1}{\mathcal{E}}{0}\cdots\IAPP{a_n}{\mathcal{E}}{0}) & \\
1102  \IAPP{\TVAR{x}}{\mathcal{E}}{0} & = & t & \mathcal{E}(x)=\mathtt{Term}~t \\
1103  \IAPP{\TVAR{x}}{\mathcal{E}}{i+1} & = & \lambda y.\IAPP{t}{\mathcal{E}}{i}
1104   & \mathcal{E}(x)=\mathtt{Term}~\lambda y.t \\
1105  \IAPP{\TVAR{x}}{\mathcal{E}}{i+1} & =
1106   & \lambda y_1.\cdots.\lambda y_{i+1}.t~y_1\cdots y_{i+1}
1107   & \mathcal{E}(x)=\mathtt{Term}~t\wedge\forall y,t\neq\lambda y.t \\
1108  \IAPP{\eta.a}{\mathcal{E}}{i} & = & \IAPP{a}{\mathcal{E}}{i+1} \\
1109 \end{array}
1110 \]
1111 \hrule
1112 \end{table}
1113
1114 \section{Type checking}
1115
1116 \subsection{Level 1 $<->$ Level 2}
1117
1118 \newcommand{\GUARDED}{\mathit{guarded}}
1119 \newcommand{\TRUE}{\mathit{true}}
1120 \newcommand{\FALSE}{\mathit{false}}
1121
1122 \newcommand{\TN}{\mathit{tn}}
1123
1124 \begin{table}
1125 \caption{\label{tab:guarded} Guarded condition of level 2
1126 pattern. Note that the recursive case of the \texttt{fold} magic is
1127 not explicitly required to be guarded. The point is that it must
1128 contain at least two distinct names, and this guarantees that whatever
1129 is matched by the recursive pattern, the terms matched by those two
1130 names will be smaller than the whole matched term.\strut} \hrule
1131 \[
1132 \begin{array}{rcll}
1133   \GUARDED(C(M(P))) & = & \GUARDED(P) \\
1134   \GUARDED(C(t_1,\dots,t_n)) & = & \TRUE \\
1135   \GUARDED(\mathtt{term}~x) & = & \FALSE \\
1136   \GUARDED(\mathtt{number}~x) & = & \FALSE \\
1137   \GUARDED(\mathtt{ident}~x) & = & \FALSE \\
1138   \GUARDED(\mathtt{fresh}~x) & = & \FALSE \\
1139   \GUARDED(\mathtt{anonymous}) & = & \TRUE \\
1140   \GUARDED(\mathtt{default}~P_1~P_2) & = & \GUARDED(P_1) \wedge \GUARDED(P_2) \\
1141   \GUARDED(\mathtt{if}~P_1~\mathtt{then}~P_2~\mathtt{else}~P_3) & = & \GUARDED(P_2) \wedge \GUARDED(P_3) \\
1142   \GUARDED(\mathtt{fail}) & = & \TRUE \\
1143   \GUARDED(\mathtt{fold}~d~P_1~\mathtt{rec}~x~P_2) & = & \GUARDED(P_1)
1144 \end{array}
1145 \]
1146 \hrule
1147 \end{table}
1148
1149 %% Assume that we have two corresponding patterns $P_1$ (level 1) and
1150 %% $P_2$ (level 2) and that we have to check whether they are
1151 %% ``correct''. First we define the notion of \emph{top-level names} of
1152 %% $P_1$ and $P_2$, as follows:
1153 %% \[
1154 %% \begin{array}{rcl}
1155 %%   \TN(C_1[P'_1,\dots,P'_2]) & = & \TN(P'_1) \cup \cdots \cup \TN(P'_2) \\
1156 %%   \TN(\TVAR{x}) & = & \{x\} \\
1157 %%   \TN(\NVAR{x}) & = & \{x\} \\
1158 %%   \TN(\IVAR{x}) & = & \{x\} \\
1159 %%   \TN(\mathtt{list0}~P'~l?) & = & \emptyset \\
1160 %%   \TN(\mathtt{list1}~P'~l?) & = & \emptyset \\
1161 %%   \TN(\mathtt{opt}~P') & = & \emptyset \\[3ex]
1162 %%   \TN(\BLOB(P''_1,\dots,P''_2)) & = & \TN(P''_1) \cup \cdots \cup \TN(P''_2) \\
1163 %%   \TN(\mathtt{term}~x) & = & \{x\} \\
1164 %%   \TN(\mathtt{number}~x) & = & \{x\} \\
1165 %%   \TN(\mathtt{ident}~x) & = & \{x\} \\
1166 %%   \TN(\mathtt{fresh}~x) & = & \{x\} \\
1167 %%   \TN(\mathtt{anonymous}) & = & \emptyset \\
1168 %%   \TN(\mathtt{fold}~P''_1~\mathtt{rec}~x~P''_2) & = & \TN(P''_1) \\
1169 %%   \TN(\mathtt{default}~P''_1~P''_2) & = & \TN(P''_1) \cap \TN(P''_2) \\
1170 %%   \TN(\mathtt{if}~P''_1~\mathtt{then}~P''_2~\mathtt{else}~P''_3) & = & \TN(P''_2) \\
1171 %%   \TN(\mathtt{fail}) & = & \emptyset
1172 %% \end{array}
1173 %% \]
1174
1175 We say that a \emph{bidirectional transformation}
1176 \[
1177   P_1 <=> P_2
1178 \]
1179 is well-formed if:
1180 \begin{itemize}
1181   \item $P_1$ is a well-formed \emph{level 1 pattern} in some context $D$ and
1182   $P_2$ is a well-formed \emph{level 2 pattern} in the very same context $D$,
1183   that is $P_1 :: D$ and $P_2 :: D$;
1184   \item the pattern $P_2$ is guarded, that is $\GUARDED(P_2)=\TRUE$;
1185   \item for any direct sub-pattern $\mathtt{opt}~P'_1$ of $P_1$ such
1186     that $\mathtt{opt}~P'_1 :: X$ there is a sub-pattern
1187     $\mathtt{default}~P'_2~P''_2$ of $P_2$ such that
1188     $\mathtt{default}~P'_2~P''_2 :: X \oplus Y$ for some context $Y$;
1189   \item for any direct sub-pattern $\mathtt{list}~P'_1~l?$ of $P_1$
1190     such that $\mathtt{list}~P'_1~l? :: X$ there is a sub-pattern
1191     $\mathtt{fold}~P'_2~\mathtt{rec}~x~P''_2$ of $P_2$ such that
1192     $\mathtt{fold}~P'_2~\mathtt{rec}~x~P''_2 :: X \oplus Y$ for some
1193     context $Y$.
1194 \end{itemize}
1195
1196 A \emph{left-to-right transformation}
1197 \[
1198   P_1 => P_2
1199 \]
1200 is well-formed if $P_2$ does not contain \texttt{if}, \texttt{fail},
1201 or \texttt{anonymous} meta patterns.
1202
1203 Note that the transformations are in a sense asymmetric. Moving from
1204 the concrete syntax (level 1) to the abstract syntax (level 2) we
1205 forget about syntactic details. Moving from the abstract syntax to the
1206 concrete syntax we may want to forget about redundant structure
1207 (types).
1208
1209 Relationship with grammatical frameworks?
1210
1211 \subsection{Level 2 $<->$ Level 3}
1212
1213 We say that an \emph{interpretation}
1214 \[
1215  P_2 <=> P_3
1216 \]
1217 is well-formed if:
1218 \begin{itemize}
1219  \item $P_2$ is a well-formed \emph{applicative symbol pattern} in some context
1220   $D$ and $P_3$ is a well-formed \emph{level 3 pattern} in the very same
1221   context $D$, that is $P_2 :: D$ and $P_3 :: D$.
1222 \end{itemize}
1223
1224 \section{Semantic selection}
1225