X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=blobdiff_plain;ds=sidebyside;f=helm%2Fpapers%2Fmatita%2Fmatita.tex;h=4a74110b4df53596967c8d3dba418adbf3c1ff49;hb=3b744d6e811f514b800c0dc3b57038f01d4ba8a6;hp=09b1f0b79a80a379f473ac2a8153e28e518f135e;hpb=3cb94e90fc5ee51fed59c9169aa9a08db6389210;p=helm.git

diff --git a/helm/papers/matita/matita.tex b/helm/papers/matita/matita.tex
index 09b1f0b79..4a74110b4 100644
--- a/helm/papers/matita/matita.tex
+++ b/helm/papers/matita/matita.tex
@@ -1,9 +1,11 @@
 \documentclass[a4paper]{llncs}
 \pagestyle{headings}
+\usepackage{color}
 \usepackage{graphicx}
 \usepackage{amssymb,amsmath}
 \usepackage{hyperref}
 \usepackage{picins}
+\usepackage{fancyvrb}
 
 %\newcommand{\logo}[3]{
 %\parpic(0cm,0cm)(#2,#3)[l]{\includegraphics[width=#1]{whelp-bw}}
@@ -17,6 +19,7 @@
 \newcommand{\IN}{\ensuremath{\mathbb{N}}}
 \newcommand{\INSTANCE}{\textsc{Instance}}
 \newcommand{\IR}{\ensuremath{\mathbb{R}}}
+\newcommand{\IZ}{\ensuremath{\mathbb{Z}}}
 \newcommand{\LIBXSLT}{LibXSLT}
 \newcommand{\LOCATE}{\textsc{Locate}}
 \newcommand{\MATCH}{\textsc{Match}}
@@ -33,9 +36,21 @@
 \newcommand{\UWOBO}{UWOBO}
 \newcommand{\WHELP}{Whelp}
 
+\definecolor{gray}{gray}{0.85} % 1 -> white; 0 -> black
+\newcommand{\NT}[1]{\langle\mathit{#1}\rangle}
+\newcommand{\URI}[1]{\texttt{#1}}
+
+%{\end{SaveVerbatim}\setlength{\fboxrule}{.5mm}\setlength{\fboxsep}{2mm}%
+\newenvironment{grafite}{\VerbatimEnvironment
+ \begin{SaveVerbatim}{boxtmp}}%
+ {\end{SaveVerbatim}\setlength{\fboxsep}{3mm}%
+ \begin{center}
+  \fcolorbox{black}{gray}{\BUseVerbatim[boxwidth=0.9\linewidth]{boxtmp}}
+ \end{center}}
+
 \newcommand{\ASSIGNEDTO}[1]{\textbf{Assigned to:} #1}
 \newcommand{\NOTE}[1]{\marginpar{\scriptsize #1}}
-\newcommand{\NT}[1]{\langle\mathit{#1}\rangle}
+\newcommand{\TODO}[1]{\textbf{TODO: #1}}
 
 \title{The Matita proof assistant}
 \author{Andrea Asperti, Claudio Sacerdoti Coen, Enrico Tassi
@@ -258,7 +273,7 @@ reduce our code in sensible way).\NOTE{righe\\\COQ{}}
 \subsubsection{Term input}
 
 The primary form of user interaction employed by \MATITA{} is textual script
-editing: the user can modifies it and evaluate step by step its composing
+editing: the user modifies it and evaluates step by step its composing
 \emph{statements}. Examples of statements are inductive type definitions,
 theorem declarations, LCF-style tacticals, and macros (e.g. \texttt{Check} can
 be used to ask the system to refine a given term and pretty print the result).
@@ -270,10 +285,9 @@ Two of the requirements in the design of such a syntax are apparently in
 contrast:
 \begin{enumerate}
  \item the syntax should be as close as possible to common mathematical practice
-  and implement widespread mathematical notions;
+  and implement widespread mathematical notations;
  \item each term described by the syntax should be non-ambiguous, meaning that
-  should exists a function which associates to each term of the syntax a CIC
-  term.
+  there should exist a function which associates to it a CIC term.
 \end{enumerate}
 
 These two requirements are addressed in \MATITA{} by means of two mechanisms
@@ -283,8 +297,16 @@ depicted in Fig.~\ref{fig:inputphase}. The architecture is articulated as a
 pipeline of three levels: the concrete syntax level (level 0) is the one the user
 has to deal with when inserting CIC terms; the abstract syntax level (level 2)
 is an internal representation which intuitively encodes mathematical formulae at
-the content level~\cite{adams}~\cite{mkm-structure}; the formal mathematics
-level (level 3) is the CIC encoding of terms.
+the content level~\cite{adams}\cite{mkm-structure}; the last level is that of
+CIC terms.
+
+\begin{figure}[ht]
+ \begin{center}
+  \includegraphics[width=0.9\textwidth]{input_phase}
+  \caption{\MATITA{} input phase}
+  \label{fig:inputphase}
+ \end{center}
+\end{figure}
 
 Requirement (1) is addressed by a built-in concrete syntax for terms, described
 in Tab.~\ref{tab:termsyn}, and the extensible notation mechanism which offers a
@@ -302,10 +324,11 @@ invalidating requirement (2).
 
 \begin{example}
 
- Consider the term \texttt{\TEXMACRO{forall} x. x + ln 1 = x}, the type of a
- lemma the user may want to prove. Assuming that both \texttt{+} and \texttt{=}
- are parsed as infix operators, all the following questions are legitimate and
- must be answered before obtaining a CIC term from its content level encoding
+ Consider the term at the concrete syntax level \texttt{\TEXMACRO{forall} x. x +
+ ln 1 = x} of Fig.~\ref{fig:inputphase}(a), which can be the type of a lemma the
+ user may want to prove. Assuming that both \texttt{+} and \texttt{=} are parsed
+ as infix operators, all the following questions are legitimate and must be
+ answered before obtaining a CIC term from its content level encoding
 (Fig.~\ref{fig:inputphase}(b)):
 
 \begin{enumerate}
@@ -326,16 +349,67 @@ invalidating requirement (2).
 \end{example}
 
 In \MATITA, three \emph{sources of ambiguity} are admitted for content level
-terms: unbound identifiers, literal numbers, and literal symbols.
-
-\emph{Unbound identifiers} (question 1) are sources of ambiguity since the same
-name could have been used in the proof assistant library to represent different
-objects. \emph{Numbers} (question 2) are ambiguous since several different
-encodings of them could be provided in the calculus. Finally, \emph{symbols}
-(question 3) are ambiguous as well, since they may be used in an overloaded
-fashion to represent the application of different objects.
-
-\textbf{FINQUI, il resto \`e copy and paste dal Whelp paper \dots}
+terms: unbound identifiers, literal numbers, and operators. Each instance of
+an ambiguity source (ambiguous entity) occurring in a content level term is
+associated with a \emph{disambiguation domain}. Intuitively, a disambiguation
+domain is a set of CIC terms which may be substituted for an ambiguous entity
+during disambiguation. Each item of the domain is said to be an
+\emph{interpretation} for the ambiguous entity.
+
+\emph{Unbound identifiers} (question 1) are ambiguous entities since the
+namespace of CIC objects is not flat and the same identifier may denote many
+of them. For example, the short name \texttt{plus\_assoc} in the \HELM{} library
+is shared by three different theorems stating the associative property of
+different additions. This kind of ambiguity is avoidable if the user is willing
+to use long names (in the form of URIs in the \texttt{cic://} scheme) in the
+concrete syntax, with the obvious drawback of obtaining long and unreadable
+terms.
+
+Given an unbound identifier, the corresponding disambiguation domain is computed
+by querying the library for all constants, inductive types, and inductive type
+constructors having it as their short name (see the \LOCATE{} query in
+Sect.~\ref{sec:metadata}).
+
+\emph{Literal numbers} (question 2) are ambiguous entities as well, since
+different kinds of numbers can be encoded in CIC (\IN, \IR, \IZ, \dots) using
+different encodings. Considering the restricted example of natural numbers, we
+can for instance encode them in CIC using inductive datatypes with a number of
+constructors equal to the encoding base plus 1, obtaining one encoding for each
+base.
+
+For each possible way of mapping a literal number to a CIC term, \MATITA{} is
+aware of a \emph{number interpretation function} which, when applied to the
+natural number denoted by the literal\footnote{At the moment only literal
+natural numbers are supported in the concrete syntax.}, returns a corresponding
+CIC term. The disambiguation domain for a given literal number is built by
+applying to the literal all available number interpretation functions in turn.
+
+Number interpretation functions can be defined in OCaml or directly using
+\TODO{notation for numbers}.
+
+\emph{Operators} (question 3) are intuitively heads of applications; as such
+they are always applied to a non-empty sequence of arguments. Their ambiguity is
+needed since it is often the case that some notation is used in an overloaded
+fashion to hide the use of different CIC constants which encode similar
+concepts. For example, in the standard library of \MATITA{} the infix \texttt{+}
+notation is available, building a binary \texttt{Op(+)} node, whose
+disambiguation domain may refer to different constants like the addition over
+natural numbers \URI{cic:/matita/nat/plus/plus.con} or that over real numbers of
+the \COQ{} standard library \URI{cic:/Coq/Reals/Rdefinitions/Rplus.con}.
+
+For each possible way of mapping a symbol application to a CIC term, \MATITA{}
+knows a \emph{symbol interpretation function} which, when applied to a symbol
+and its arguments, returns a CIC term. The disambiguation domain for a given
+operator is built by applying to the symbol and its arguments all available
+symbol interpretation functions in turn.
+
+\begin{grafite}
+ foo
+ bar
+ baz
+\end{grafite}
+
+\TODO{UP TO HERE, the rest is copy and paste from the Whelp paper \dots}
 
 Note that given a content level term with more than one source of ambiguity,
 not all possible disambiguation choices are valid: for example, given the input
@@ -430,6 +504,7 @@ that avoids backtracking is also presented.
 \ASSIGNEDTO{csc}
 
 \subsection{Search and indexing}
+\label{sec:metadata}
 \ASSIGNEDTO{andrea}
 
 \subsection{auto}
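
The three kinds of disambiguation domain described in the added text (unbound identifiers, literal numbers, operators) can be pictured with a small sketch. It is written in Python for brevity rather than in the OCaml mentioned in the paper, and it is not taken from the Matita sources. The term representation, the function names, the `LIBRARY` index, and the `plus_assoc` URIs are all hypothetical; only the two addition URIs come from the text.

```python
from typing import Callable, Dict, List, Sequence, Union

# Toy stand-in for CIC terms (an assumption, not Matita's data type):
# a string names a constant, a tuple (head, arg1, ..., argn) is an application.
Term = Union[str, tuple]

# Hypothetical short-name index standing in for the library query.
# These three URIs are invented for illustration.
LIBRARY: Dict[str, List[str]] = {
    "plus_assoc": [
        "cic:/example/nat/plus_assoc.con",
        "cic:/example/int/plus_assoc.con",
        "cic:/example/real/plus_assoc.con",
    ],
}


def identifier_domain(short_name: str) -> List[Term]:
    """Domain of an unbound identifier: every object sharing its short name."""
    return list(LIBRARY.get(short_name, []))


def nat_unary(n: int) -> Term:
    """Hypothetical unary encoding: n applications of S to O."""
    term: Term = "nat.O"
    for _ in range(n):
        term = ("nat.S", term)
    return term


def real_of_nat(n: int) -> Term:
    """Hypothetical real encoding: an injection applied to the unary natural."""
    return ("real.of_nat", nat_unary(n))


# One number interpretation function per supported encoding.
NUMBER_INTERPRETATIONS: List[Callable[[int], Term]] = [nat_unary, real_of_nat]


def number_domain(literal: str) -> List[Term]:
    """Apply every number interpretation function to the literal's value."""
    value = int(literal)
    return [interpret(value) for interpret in NUMBER_INTERPRETATIONS]


def nat_plus(args: Sequence[Term]) -> Term:
    """Interpret + as addition over natural numbers (URI from the text)."""
    return ("cic:/matita/nat/plus/plus.con", *args)


def real_plus(args: Sequence[Term]) -> Term:
    """Interpret + as addition over real numbers (URI from the text)."""
    return ("cic:/Coq/Reals/Rdefinitions/Rplus.con", *args)


# Symbol interpretation functions, grouped by the symbol they interpret.
SYMBOL_INTERPRETATIONS: Dict[str, List[Callable[[Sequence[Term]], Term]]] = {
    "+": [nat_plus, real_plus],
}


def operator_domain(symbol: str, args: Sequence[Term]) -> List[Term]:
    """Apply every interpretation registered for the symbol to its arguments."""
    return [interpret(args) for interpret in SYMBOL_INTERPRETATIONS.get(symbol, [])]
```

In this sketch each domain is simply a list of candidate terms; choosing among the candidates is the disambiguation step proper and is left out.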