X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=blobdiff_plain;f=helm%2FDEVEL%2Fmathml_editor%2Fdoc%2Fspec.tex;h=be9ba51bb68495a7dd80a18f213ae626b790cdb5;hb=ced0086b4bb5b10e6c4a8268ea781624a41394dd;hp=1c20a15649461cb4dc93cbc07df799c1e7f3ebe8;hpb=e423f7f8b93179733ed81ee4cf6f8f2930e31d42;p=helm.git diff --git a/helm/DEVEL/mathml_editor/doc/spec.tex b/helm/DEVEL/mathml_editor/doc/spec.tex index 1c20a1564..be9ba51bb 100644 --- a/helm/DEVEL/mathml_editor/doc/spec.tex +++ b/helm/DEVEL/mathml_editor/doc/spec.tex @@ -5,11 +5,14 @@ \usepackage{euler} \usepackage{amssymb} \usepackage{stmaryrd} +\usepackage{wasysym} -\title{A MathML Editor Based on \TeX{} Syntax\\Formal Specification} +\title{\EdiTeX: a MathML Editor Based on \TeX{} Syntax\\\small Description and Formal Specification} \author{Paolo Marinelli\\Luca Padovani\\\small\{{\tt pmarinel},{\tt lpadovan}\}{\tt @cs.unibo.it}\\\small Department of Computer Science\\\small University of Bologna} \date{} +\newcommand{\EdiTeX}{Edi\TeX} + \newcommand{\tmap}[1]{\llbracket#1\rrbracket} \newcommand{\tadvance}{\vartriangle} \newcommand{\tnext}{\rhd} @@ -27,11 +30,112 @@ \newcommand{\ROW}{\texttt{row}} \newcommand{\SLDROP}{\blacktriangleleft} \newcommand{\NLDROP}{\vartriangleleft} +\newcommand{\RDROP}{\vartriangleright} \begin{document} \maketitle +\section{Introduction} + +MathML~\cite{MathML1,MathML2,MathML2E} is an XML application for the +representation of mathematical expressions. As most XML applications, +MathML is unsuitable to be hand-written, except for the simplest +cases, because of its verbosity. In fact, the MathML specification +explicitly states that +\begin{quote} +``While MathML is human-readable, it is anticipated that, in all but +the simplest cases, authors will use equation editors, conversion +programs, and other specialized software tools to generate MathML'' +\end{quote} + +The statement about human readability of MathML is already too strong, +as the large number of mathematical symbols, operators, and +diacritical marks that are used in mathematical notation cause MathML +documents to make extensive use of Unicode characters that typically +are not in the ``visible'' range of common text editors. Such +characters may appear as entity references, whose name indicates +somehow the kind of symbol used, or character references or they are +directly encoded in the document encoding scheme (for instance, +UTF-8). + +It is thus obvious that authoring MathML documents assumes the +assistance of dedicated tools. As of today, such tools can be +classified into two main categories: +\begin{enumerate} + \item WYSIWYG (What You See Is What You Get) editors that allow the + author to see the formatted document on the screen as it is + composed; + \item conversion tools that generate MathML markup from different + sources, typically other markup languages for scientific + documents, such as \TeX. +\end{enumerate} + +While the former tools are certainly more appealing, especially to the +unexperienced user, as they give a direct visual feedback, the +existance of tools in the second category takes into account the large +availability of existing documents in \TeX{} format, and also the fact +that experienced or ``lazy'' users may continue to prefer the use of a +markup language other than MathML for editing, and generate MathML +only as a final step of the authoring process. The ``laziness'' is not +really intended as a way of being reluctant towards a new technology, +but rather as a justified convincement that WYSIWYG editors are ``nice +to look at'' but after all they may slow down the authoring process. +WYSIWYG editors often involve the use of menus, palettes of symbols, +and, in general, an extensive use of the pointing device (the mouse) +for completing most operations. The use of shortcuts is of little +help, as it implies very soon a challenging exercise for the fingers +and the mind. Moreover, authors \emph{cannot improve} their authoring +speed with time. On the other side, the gap between the syntax of any +markup language for mathematics and mathematical notation may be +relevant, especially for large, non-trivial formulas and authoring is +a re-iterated process in which the author repeadtedly types the markup +in the editor, compiles, and looks at the result inside a pre-viewer. + +\EdiTeX{} tries to synthesize the ``best of both worlds'' in a single +tool. The basic idea is that of creating a WYSIWYG editor in which +editing is achieved by typing \TeX{} markup as the author would do in +a text editor. The \TeX{} markup is tokenized and parsed on-the-fly +and a corresponding MathML representation is created and +displayed. This way, the author can see the rendered document as it +changes. The advantages of this approach can be summarized as follows: +\begin{itemize} + \item the document is rendered concurrently with the editing, the + user has an immediate feedback hence it is easier to spot errors; + \item the author types in a concrete (and likely familiar) syntax + improving the editing speed; + \item the usual WYSIWYG mechanisms are still available. In + particular, it is possible to select \emph{visually} a fragment of + the document that needs re-editing, or that was left behind for + subsequent editing. +\end{itemize} + +\paragraph{The Name of the Game:} there is no reference to MathML in +the name ``\EdiTeX.'' In fact, the architecture of the editor is not +tied to MathML markup. Although we focus on MathML editing, by +changing a completely modularized component of the editor it is +virtually possible to generate any other markup language. + +\paragraph{Acknowledgments.} Stephen M. Watt and Igor Rodionov for +their work on the \TeX{} to MathML conversion tool; Stan Devitt for an +illuminating discussion about the architecture of \TeX{} to XML +conversion tools; Claudio Sacerdoti Coen for the valuable feedback and +uncountable bug reports. + +\section{Architecture} + +\section{Customization} + +\subsection{Short and Long Identifiers} + +\subsection{The Dictionary} + +\subsection{Stylesheets and Trasformations} + +\subsection{Rendering} + +\section{XML Representation of \TeX{} Markup} + \section{Tokens} The following tokens are defined: @@ -77,7 +181,7 @@ The following tokens are defined: \end{tabular} %$ -%% \section{Description and Semantics of the Pattern Language} +\section{Description and Semantics of the Pattern Language} %% \begin{eqnarray*} %% \mathit{NodeTest} & ::= & \mathtt{*} \\ @@ -93,6 +197,28 @@ The following tokens are defined: %% & | & \mathit{AttributeName}\mathtt{='}\mathit{Text}\mathtt{'} %% \end{eqnarray*} +\begin{table} +\[ +\begin{array}{rcl@{\hspace{3em}}rcl@{\hspace{3em}}rcl} + C &::=& Q\verb+/+ & Q &::=& T & P &::=& P' \\ + &|& Q\verb+//+ & &|& Q\verb+[@+n\verb+=+v\verb+]+ & &|& \verb+^+P'\\ + &|& \verb+(+C\verb+)+ & &|& Q\verb+[!@+n\verb+=+v\verb+]+ & &|& P'\verb+$+\\%$ + &|& (\alpha\verb+=+C) & &|& Q\verb+[@+n\verb+]+ & &|& \verb+^+P'\verb+$+\\%$ + &|& C_1 \verb+&+ C_2 & &|& Q\verb+[+P\verb+]+ & & &\\ + &|& C_1 \verb+|+ C_2 & & & & P' &::=& P''\\ + &|& C\verb.+. & T &::=& N & &|& P''\verb+#+P''\\ + &|& C\verb+*+ & &|& C N & & &\\ + &|& C\verb+?+ & & & & P'' &::=& \\ + &|& C_1~C_2 & N &::=& \verb+<*>+ & &|& T\;P''\\ + &|& \verb+!+C & &|& \verb+<+n_1\verb+|+\cdots\verb+|+n_2\verb+>+ & & &\\ + & & & &|& \verb++ & & &\\ +\end{array} +\] +\caption{Syntax of the pattern language. $n$, $n_i$ denote names, $v$ +denotes a string enclosed in single or double quotes} +\end{table} + + \section{Insert Rules} \paragraph{Begin Group:} $\{$ @@ -406,138 +532,8 @@ cursor with \ONODE{}, append $\tadvance$ after the \ONODE{} node \begin{description} - %********* rules that try to express situations completely specified (there is an effective deletion) ****** - - % in the rules below, a token is either an i node, an n node, an o node, an s node or an empty c node. - % an empty c node is either an undefined macro or an empty macro. These c node are handled as they actually were - % tokens (i, n, o, s). - % An important observation is: a sequence of groups with id, in which every group has one and only one child and where - % the last group contains the cursor, is equivalent to the cursor (Is it clear?). For example: - % ...... is equivalent to: - % - - % there is an optional sequence of groups with id, all of which having one and only one child. The last - % child of this sequence (or the root, if the sequence is empty) is a group with id, has as first element a - % token and has as second and last element an eventually empty sequence of groups with id, all of which having - %one and only one child. The last element of this sequence has the cursor as its child. - \item{\verb+(g[@id][^#$]/++\verb+)?g[@id][^(i|n|o|s|c[ ])#$]/(g[@id][^#$]/++\verb+)?cursor+}\\ - replace the whole fragment with the cursor. - - % the cursor is a macro's child. There is an eventually empty sequence of delimiters, which precedes the - % cursor. Before this sequence, there is a p node. This p node has as its last child a token - \item{\verb+c[p[(i|n|o|s|c[ ])$]((i|n|o|s|c[ ])*)#]/cursor+}\\ - remove the cursor and replace the token with the cursor. - - % there is a token, which precedes the cursor. The cursor is child of either a group or a parameter. - % the case where the cursor is child of a group with id, there are no nodes after the cusror and there - % are no nodes preceding the token, is handled by the first rule. - % This rule is duplicated below, but that one could have a wrong syntax. This one has been tested. - \item{\verb+[(i|n|o|c[ ])#]/cursor+}\\ - remove the cursor and replace the token with the cursor. - - % cursor's parent is a script (the cursor is not the base). The script's base is an empty group with id. - % The script is preceded by a token.The script's parent is group or a parameter. - % The case where the script's parent is a group with id and the token is the first child, while - % the script is the last one, is handled some rules later. - \item{\verb+[(i|n|o|s|c[ ])#]/[^(g[@id][ ])#$]/cursor+}\\ - remove the script and replace the token with the cursor. - - % As above, but: we don't care who precedes the script (we do not have to remove it); the script's base - % is still an empty group, but it has no id. - % Ad above, we have to precise that the situation where the script is the only child of a group with id - % is handled some rules later. - \item{\verb+/[^(g[!@id][ ])#$]/curosr+}\\ - replace the script with the cursor. - - % The cursor is the last child of a script. The script's base is token. - \item{\verb+/[^(i|n|o|s|c[ ])#$]/cursor+}\\ - repalce the script with the cursor. - - - %******************************************************************************************************* - %************** rules handling the case in which the cursor has a preceding node *********************** - %******************************************************************************************************* - - %************************* the cursor's parent is a group or a parameter ******************************* - - %we consider an empty macro as a token. An empty macro is either undefined or simply empty. - - %rule handling the case where the cursor has a preceding token and this is the first node of a group with id - \item{\verb+g[@id][^(i||n||o||s||c[^$])#]/cursor+}\\ - remove the cursor and replace the \INODE{}, \NNODE{}, \ONODE{}, \SNODE{} or \CNODE{} node with the $\NLDROP$. - \item{\verb+[(i||n||o||s||c[^$])#]/cursor+}\\ - remove the \INODE{}, \NNODE{}, \ONODE{}, \SNODE{} or \CNODE{} node. - %it corresponds to the drop_prev_group(false). - \item{\verb+[g#]/cursor+}\\ - remove the cursor and append the $\NLDROP$ to the \G{} node. - %corresponds to the drop_prev_script(false). - \item{\verb+[()#]/cursor+}\\ - remove the cursor and append the $\NLDROP$ to the \SP{} or \SB{} node. - %rules handling the right open macros preceding the cursor. - \item{\verb+[(c[^p[@right-open='1']$])#]/cursor+}\\ - remove the cursor and append the $\NLDROP$ to the \PNODE{} node. - %we don't have a macro of this kind, but we don't know the future... - \item{\verb+[(c[^p[!(@right-open='1')][@left-open='1'][^$]$])#]/cursor+}\\ - remove the cursor and replace the \CNODE node with the $\NLDROP$. - \item{\verb+[(c[^p[!(@right-open='1')][@left-open='1']$])#]/cursor+}\\ - replace the \CNODE{} node with the content of the \PNODE node and replace the cursor with the $\NLDROP$. - %rules handling macro with parameter(s) and preceding the cursor. - \item{\verb+[c#]/cursor+}\\ - remove the cursor and append the $\NLDROP$ to the \CNODE{} node. - %rule handling table with rows preceding the cursor. - \item{\verb+[(table[row[cell[^g$]$]$])#]/cursor+}\\ - remove the cursor and append the $\NLDROP$ to the \G{} node, which is child of the \CELL{} node. - %rule handling tables without rows preceding the cursor. - \item{\verb+[table#]/cursor+}\\ - remove the cursor and append the $\NLDROP$ to the \TABLE{} node - - %********************************* the cursor's parent is a script ****************************** - - %rule handling the case where the script's base is an empty group with id. - \item{\verb+[^(g[@id][^$])#$]/cursor+}\\ - replace the \SP{} or \SB{} node with the $\NLDROP$. - %rule handling the case where the script's base is an empty group without id - \item{\verb+[^(g[^$])#$]/cursor+}\\ - replace the \SP{} or \SB{} node with the cursor. - %rule handling the case where the script's base is something else - \item{\verb+[^*#$]/cursor+}\\ - replace the \SP{} or \SB{} node with it's first child and insert the $\NLDROP$ after it. - - %********************************* the cursor's parent is a macro ******************************** - - %rule handling the case where the preceding node is a parameter - \item{\verb+c[p#]/cursor+}\\ - remove the cursor and append the $\NLDROP$ to the PNODE{} node. - %rule handling the case where the preceding node is a delimiter - \item{\verb+c[()#]/cursor+}\\ - remove the cursor and insert the $\NLDROP$ before the \INODE{}, \NNODE{}, \ONODE{}, \SNODE{} or \CNODE{} node. - - %************************************************************************************************* - %*********** rules handling the case in which the cursor has no preceding nodes ****************** - %************************************************************************************************* - - %rule handling the case where the cursor has something else after it - \item{\verb+[#*]/cursor+}\\ - nothing to drop. - - %rules handling the case where the cursor's parent is a p node and the cursor has no nodes after it. - - %rule handling the case where the p node is a left and right open macro's child and the first p node has no elements. - \item{\verb+c[^(p[@left-open='1'][^$])#$]/p[@right-open='1'][^#$]/cursor+}\\ - replace the \CNODE{} with the $\NLDROP$. - %rule handling the case where the p node is a left and right open macro's child and the first p node has elements. - \item{\verb+c[^(p[@left-open='1']#$]/p[@right-open='1'][^#$]/cursor+}\\ - replace the \CNODE{} with the content of the PNODE{} with attribute texttt{left-open='1'} and insert $\NLDROP$ after this. - %rule handling the case where the p node is right open macro's child. - \item{\verb+c[^#$]/p[@right-open='1'][^#$]/cursor+}\\ - replace the \CNODE{} with the $\NLDROP$. - %rule handling the case where the p node is a macro's child and this macro has parameter. - \item{\verb+c/p[^#$]/cursor+}\\ - remove the cursor and insert the $\NLDROP$ before the \PNODE{} node. - - %rule handling the case where the cursor has no nodes after it and its parent is a group with id - \item{\verb+g[@id][^#$]/cursor+}\\ - replace the \G{} node with the $\NLDROP$. + \item{\verb+cursor+}\\ + replace the cursor with the $\NLDROP$. \end{description} @@ -545,171 +541,106 @@ cursor with \ONODE{}, append $\tadvance$ after the \ONODE{} node \begin{description} - %******************************************************************************************************* - %************** rules handling the case in which the cursor has a preceding node *********************** - %******************************************************************************************************* - - %************************* the cursor's parent is a group or a parameter ******************************* - - %this rule is more specific than the one below which handle the case of the cursor preceded by a c node - \item{\verb+[(i||n||o||s||c[^$])#]/cursor+}\\ - remove the \INODE{}, \NNODE{}, \ONODE{}, \SNODE{} or \CNODE{} node. - %it corresponds to the drop_prev_group(true). - \item{\verb+[g#]/cursor+}\\ - remove the cursor and append the it to the \G{} node. - %corresponds to the drop_prev_script(true). - \item{\verb+[()#]/cursor+}\\ - remove the cursor and append the $\SLDROP$ to the \SP{} or \SB{} node. - %rules handling the right open macros preceding the cursor. - \item{\verb+[(c[^p[@right-open='1']$])#]/cursor+}\\ - remove the cursor and append the $\SLDROP$ to the \PNODE{} node. - %we don't have a macro of this kind, but we don't know the future... - \item{\verb+[(c[^p[!(@right-open='1')][@left-open='1'][^$]$])#]/cursor+}\\ - remove the cursor and replace the \CNODE node with the cursor. - \item{\verb+[(c[^p[!(@right-open='1')][@left-open='1']$])#]/cursor+}\\ - replace the \CNODE{} node with the content of the \PNODE node. - %rule handling macro with parameter(s) and preceding the cursor. - \item{\verb+[c#]/cursor+}\\ - remove the cursor and append the $\SLDROP$ to the \CNODE{} node. - %rule handling table with rows preceding the cursor. - \item{\verb+[(table[row[cell[^g$]$]$])#]/cursor+}\\ - remove the cursor and append the it to the \G{} node, which is child of the \CELL{} node. - %rule handling tables without rows preceding the cursor. - \item{\verb+[table#]/cursor+}\\ - remove the cursor and append the $\SLDROP$ to the \TABLE{} node - - %********************************* the cursor's parent is a script ****************************** - - %rule handling the case where the script's base is an empty group with id. - \item{\verb+[^(g[@id][^$])#$]/cursor+}\\ - replace the \SP{} or \SB{} node with the \G{} node and insert the cursor after it. - %rule handling the case where the script's base is an empty group without id - \item{\verb+[^(g[^$])#$]/cursor+}\\ - replace the \SP{} or \SB{} node with the cursor. - %rule handling the case where the scrip's base is something else - \item{\verb+[^*#$]/cursor+}\\ - replace the \SP{} or \SB{} node with it's first child and insert the cursor after it. - - %********************************* the cursor's parent is a macro ******************************** - - %rule handling the case where the preceding node is a parameter - \item{\verb+c[p#]/cursor+}\\ - remove the cursor and append the $\SLDROP$ to the PNODE{} node. - %rule handling the case where the preceding node is a delimiter - \item{\verb+c[()#]/cursor+}\\ - remove the cursor and insert the it before the \INODE{}, \NNODE{}, \ONODE{}, \SNODE{} or \CNODE{} node. - - %************************************************************************************************* - %*********** rules handling the case in which the cursor has no preceding nodes ****************** - %************************************************************************************************* - - %rule handling the case where the cursor has something else after it - \item{\verb+[#*]/cursor+}\\ - nothing to drop. - - %rules handling the case where the cursor's parent is a p node and the cursor has no nodes after it. - - %rule handling the case where the p node is a left and right open macro's child and the first p node has no elements. - \item{\verb+c[^(p[@left-open='1'][^$])#$]/p[@right-open='1'][^#$]/cursor+}\\ - replace the \CNODE{} with the cursor. - %rule handling the case where the p node is a left and right open macro's child and the first p node has elements. - \item{\verb+c[^(p[@left-open='1']#$]/p[@right-open='1'][^#$]/cursor+}\\ - replace the \CNODE{} with the content of the PNODE{} with attribute texttt{left-open='1'} and insert cursor after this. - %rule handling the case where the p node is right open macro's child. - \item{\verb+c[^#$]/p[@right-open='1'][^#$]/cursor+}\\ - replace the \CNODE{} with the cursor. - %rule handling the case where the p node is a macro's child and this macro has parameter. - \item{\verb+c/p[^#$]/cursor+}\\ - remove the cursor and insert the $\SLDROP$ before the \PNODE{} node. - - %rule handling the case where the cursor has no nodes after it and its parent is a group with id - \item{\verb+g[@id][^#$]/cursor+}\\ - replace the \G{} node with the cursor. + \item{\verb+cursor+}\\ + replace the cursor with the $\SLDROP$. \end{description} \section{Right Drop Rules} +\begin{description} + + \item{\verb+cursor+}\\ + replace the cursor with the $\RDROP$. + +\end{description} + \section{$\varepsilon$-rules} \paragraph{Nromal Left Drop} \begin{description} + \item{\verb+math/g[^#]/+$\NLDROP$}\\ + repalce the $\NLDROP$ with the cursor. - %********************************************* \NLDROP has a preceding node ********************************************* - - %rule handling the case where the preceding node is a token or an empty macro. - \item{\verb+*[(i||n||o||s||c[^$])#]/+$\NLDROP$}\\ - remove the \INODE{}, \NNODE{}, \ONODE{}, \SNODE{} or the \CNODE{} node. - %rule handling the case where the preceding node is a group - \item{\verb+*[g#]/+$\NLDROP$}\\ - remove the $\NLDROP$ append it to the \G{} node. - %rule handling the case where the preceding node is a script - \item{\verb+*[#]/+$\NLDROP$}\\ - remove the $\NLDROP$ append it to the \SP{} or \SB{} node. - %rule handling the case where the preceding node is an open-right macro - \item{\verb+*[(c[^p[@right-open='1']$])#]/+$\NLDROP$}\\ - remove the $\NLDROP$ and append it to the \PNODE{} node. - %rule handling the case where the preceding node is a left-open macro (and not right-open), and the p node of this macro has no children - \item{\verb+*[(c[^p[!(@right-open='1')][@left-open='1'][^$]$])#]/+$\NLDROP$}\\ - remove the \CNODE node. - %rule handling the case where the preceding node is a left-open macro (and not right-open), and the p node of this macro has children - \item{\verb+*[(c[^p[!(@right-open='1')][@left-open='1']$])#]/+$\NLDROP$}\\ - replace the \CNODE{} node with the content of the \PNODE{} node. - %rule handling the case where the preceding node is a macro with parameter(s) - \item{\verb+*[c#]/+$\NLDROP$}\\ - remove the $\NLDROP$ and append it to the \CNODE{} node. - - %rule handling the deletion of primes in superscript - - %there are more than one prime in the phantom group - \item{\verb+sp[!(@id)][*#$]/g[!(@id)][o#]/+$\NLDROP$}\\ - remove the \ONODE node, remove the $\NLDROP$ and insert the cursor after the \SP{} node. - %there is one and only one prime in the phantom group - \item{\verb+sp[!(@id)][*#$]/g[!(@id)][^o#]/+$\NLDROP$}\\ - replace the \SP{} node with it's first child and insert the cursor after it - - %*** rules handling the case where the \NLDROP's parent is a macro (with parameter) *** - - %the node preceding the \NLDROP is a delimiter - \item{\verb+c[()#]/+$\NLDROP$}\\ - remove the $\NLDROP$ and insert it before the \INODE{}, \NNODE{}, \ONODE{}, \SNODE{} or \CNODE{} node. - %the node preceding the \NLDROP is parameter - \item{\verb+c[p#]/+$\NLDROP$}\\ - remove the $\NLDROP$ and append it to the \PNODE{} node. + %************************************************************************************** + %****************************** epsilon-rules with \NLDROP **************************** + %************************************************************************************** - %********************************************* the \NLDROP has no preceding nodes **************************************************** + %************** \NLDROP has neither preceding nor following nodes ******************** - %if the \NLDROP has no preceding nodes, but has following nodes... - \item{\verb+[#*]/+$\NLDROP$}\\ + \item{\verb+math[^#$]/+$\NLDROP$}\\ replace the $\NLDROP$ with the cursor. - %rule handling the case where the $\NLDROP$ is the only child of a group with id. - \item{\verb+g[@id][^#$]/+$\NLDROP$}\\ + + \item{\verb+g[^#$]/+$\NLDROP$}\\ replace the \G{} node with the $\NLDROP$. - %\NLDROP is the first child of a macro with no inserted parameter - \item{\verb+c[!(p[*])][^#]/+$\NLDROP$}\\ - replace the \CNODE{} with the $\NLDROP$. - %\NLDROP is the first child of a macro with some inserted parameter - \item{\verb+c[p[*]][^#]/+$\NLDROP$}\\ - put the cursor in the first \PNODE{} node. - - - %**** rule handling the case where the \NLDROP has a preceding token or a preceding empty macro, which is either undefined or empty **** - - %%the \NLDROP's parent is neither a macro nor a phantom group which in turn is a sp's child and the token is a prime - %\item{\verb+*[(i||n||o||s||c[^$])#]/+$\NLDROP$}\\ - %replace the \INODE{}, \NNODE{}, \SNODE{}, \ONODE{}, \SNODE{} or \CNODE{} node with the cursor and remove the $\NLDROP$. - %%the \NLDROP's parent is a macro with argument - %\item{\verb+c[((!(p[@left-open='1']))&(!(p[@right-open+'1'])))][(i||n||o||s||c[^$])#]/+$\NLDROP$}\\ - %remove the $\NLDROP$ and insert it before the \INODE{}, \NNODE{}, \ONODE{}, \SNODE{} or \CNODE{} node. - %%rule handling the case where the \NLDROP's parent is a phantom group, which in turn is a sp's child - %\item{\verb+sp[!(@id)][^*#$]/g[!(@id)][o#$]/+$\NLDROP$}\\ - %remove the \ONODE{} node - - %********************************* rules handling the case where the \NLDROP has a preceding group ************************************** - \item{\verb+*[g#]/+$\NLDROP$}\\ - remove the $\NLDROP$ and append it to the \G{} node. + + % this rule is overridden by the two ones below + \item{\verb+c/p[^#$]/+$\NLDROP$}\\ + remove the $\NLDROP$ and insert it before the \PNODE{} node. + + \item{\verb+c[p[@left-open='1'][*]#$]/p[@right-open='1'][^#$]/+$\NLDROP$}\\ + replace the \CNODE{} node with the content of the first \PNODE{} node and insert the $\NLDROP$ after this content + + \item{\verb+c[p[@left-open='1'][!*]#$]/p[@right-open='1'][^#$]/+$\NLDROP$}\\ + replace the \CNODE{} node with the $\NLDROP$. + + \item{\verb+c[^#][!p(*)]/+$\NLDROP$}\\ + replace the \CNODE{} node with the $\NLDROP$. + + \item{\verb+cell[^#$]/+$\NLDROP$}\\ + replace the cell with the $\NLDROP_n$. + + \item{\verb+table[^#$]/+$\NLDROP$}\\ + replace the \TABLE{} node with the $\NLDROP$. + + %************************* \NLDROP has at least one preceding node ********************* + + % general rules + + % this rule should also handles the case where the \NLDROP is the third (and last) child of a script. + \item{\verb+*[*#]/+$\NLDROP$}\\ + remove the $\NLDROP$ and append it as the last child of its ex preceding brother. + + % this rule overrides the one above + \item{\verb+*[(i|n|o|s|c[!*])#]/+$\NLDROP$}\\ + remove the $\NLDROP$ and replace the token with the $\NLDROP_n$. + + % special rules + + \item{\verb+[^*#$]+/$\NLDROP$}\\ + replace the script node with its first child and insert the $\NLDROP$ after it. + + % this rule overrides the one above. + \item{\verb+[^g[!@id][!*]#$]/+$\NLDROP$}\\ + replace the script with the cursor. + + % this rule overrides the one above + \item{\verb+*[sp[!@id][^*g[!@id][^o[@name='prime']++\verb+o[@name='prime']$]]#]/+$\NLDROP$}\\ + remove the last \ONODE{} node and replace the $\NLDROP$ with the cursor.%$\NLDROP_n$. + + \item{\verb+*[sp[!@id][^*g[!@id][^o[@name='prime']$]]#]/+$\NLDROP$}\\ + replace the script with its first child and replace the $\NLDROP$ with the cursor.%$\NLDROP_n$. + + \item{\verb+c[(i|n|o|s|c[!*])#]/+$\NLDROP$}\\ + move the $\NLDROP$ before the delimiter. + + % this rule is true for both right-open and parameterized macros. + \item{\verb+c[p#]/+$\NLDROP$}\\ + move the $\NLDROP$ into the \PNODE{} node. + + %**************** \NLDROP has no preceding nodes, but has following nodes ************** + + % general rule + \item{\verb+*[^#*]/+$\NLDROP$}\\ + remove the $\NLDROP$ and insert it before its parent. + + % special rules + + % this rule is applicable to all macros. + \item{\verb+c[^#][p[*]]/+$\NLDROP$}\\ + remove the $\NLDROP$ and insert it before the \CNODE{} node. \end{description} @@ -717,7 +648,160 @@ cursor with \ONODE{}, append $\tadvance$ after the \ONODE{} node \begin{description} - \item{special left}\\ + %******************************************************************************************************** + %************************************ epsilon-rules with \SLDROP **************************************** + %******************************************************************************************************** + + \item{\verb+math/+$\SLDROP$}\\ + replace the $\SLDROP$ with the cursor. + + \item{\verb+math/g[^#]/+$\NLDROP$}\\ + replace the $\NLDROP$ with the cursor. + + %************************ \SLDROP has neither preceding nor following nodes ***************************** + + \item{\verb+g[^#$]/+$\SLDROP$}\\ + replace the \G{} node with the cursor. + + \item{\verb+c[p[@left-open='1'][*]#$]/p[@right-open='1'][^#$]/+$\SLDROP$}\\ + replace the \CNODE{} node with the content of the first \PNODE{} node and insert the cursor after this content + + \item{\verb+c[p[@left-open='1'][!*]#$]/p[@right-open='1'][^#$]/+$\SLDROP$}\\ + replace the \CNODE{} node with the cursor. + + \item{\verb+c/p[^#$]/+$\SLDROP$}\\ + remove the $\SLDROP$ and insert it before the \PNODE{} node. + + \item{\verb+c[^#][!p(*)]/+$\SLDROP$}\\ + replace the \CNODE{} node with the cursor. + + \item{\verb+cell[^#$]/+$\SLDROP$}\\ + replace the cell with the $\NLDROP_n$. + + \item{\verb+table[^#$]/+$\SLDROP$}\\ + replace the \TABLE{} node with the cursor. + + %*********************** \SLDROP has at least one preceding node *********************************** + + \item{\verb+*[sp[!@id][^*g[!@id][^o[@name='prime']++\verb+o[@name='prime']$]]#]/+$\SLDROP$}\\ + remove the last \ONODE{} node and replace the $\SLDROP$ with the cursor. + + \item{\verb+*[sp[!@id][^*g[!@id][^o[@name='prime']$]]#]/+$\SLDROP$}\\ + replace the script with its first child and replace the $\SLDROP$ with the cursor.%$\NLDROP_n$. + + \item{\verb+[^g[!@id][!*]#$]/+$\SLDROP$}\\ + replace the script with the cursor. + + % this rule is overridden by the three rules above. + \item{\verb+[^*#$]+/$\SLDROP$}\\ + replace the script node with its first child and insert the cursor after it. + + \item{\verb+c[(i|n|o|s|c[!*])#]/+$\SLDROP$}\\ + remove the $\SLDROP$ and insert the cursor before the delimiter. + + \item{\verb+c[p#(i|n|o|s|c[!*])]/+$\SLDROP$}\\ + remove the $\SLDROP$ and insert the cursor into the \PNODE{} node. + + \item{\verb+c[p[@right-open='1']#]+}\\ + remove the $\SLDROP$ and append the curor as last child of the \PNODE{} node. + + % this rule is overridden by the two ones above. + \item{\verb+c[p#]/+$\SLDROP$}\\ + move the $\SLDROP$ into the \PNODE{} node. + + \item{\verb+*[(i|n|o|s|c[!*])#]/+$\SLDROP$}\\ + remove the $\SLDROP$ and replace the token with the cursor. + + \item{\verb+*[table#]/+$\SLDROP$}\\ + remove the $\SLDROP$ and append the $\NLDROP_n$ as the last child of the \TABLE{} node. + + \item{\verb+*[c#]/+$\SLDROP$}\\ + move the $\SLDROP$ into the \CNODE{} node. + + \item{\verb+*[g#]/+$\SLDROP$}\\ + remove the $\SLDROP$ and append the cursor as the last child of the \G{} node. + + %********** \SLDROP has no preceding node, but has following ones ************** + + \item{\verb+c[^#p][p(*)]/+$\SLDROP$}\\ + remove the $\SLDROP$ and insert the cursor before the \CNODE{} node. + + % general rule + \item{\verb+*[^#*]/+$\SLDROP$}\\ + remove the $\SLDROP$ and insert the cursor before its parent. + +\end{description} + +\paragraph{Normalize Left Drop} + +\begin{description} + + %**************************************************************************************** + %***************************** epsilon-rules with \NLDROP_n ***************************** + %**************************************************************************************** + + \item{\verb+*[*#]/+$\NLDROP_n$}\\ + replace the $\NLDROP_n$ with the cursor. + + \item{\verb+row[cell#]/+$\NLDROP_n$}\\ + remove the $\NLDROP_n$ and append the cursor as the last child of the \CELL{} node. + + \item{\verb+row[^#$]/+$\NLDROP_n$}\\ + replace the \ROW{} node with the $\NLDROP_n$ + + \item{\verb+table[row#]/+$\NLDROP_n$}\\ + remove the $\NLDROP_n$ and append it as last child of the \ROW{} node. + + \item{\verb+table[^#$]/+$\NLDROP_n$}\\ + replace the \TABLE{} with the cursor.%$\NLDROP_n$. + + \item{\verb+g[@id][^#$]/+$\NLDROP_n$}\\ + replace the \G{} node with the $\NLDROP_n$. + + \item{$\NLDROP_n$}\\ + replace the $\NLDROP_n$ with the cursor. + +\end{description} + +\paragraph{Right Drop} + +\begin{description} + + %************************* \RDROP has at least a following node **************************************** + + \item{\verb+c[#(i|n|o|s|c[!*])]/+$\RDROP$}\\ + remove the $\RDROP$ and append it after the delimiter + + \item{\verb+*[#(i|n|o|s|c[!*])]/+$\RDROP$}\\ + remove the token and replace the $\RDROP$ with the cursor $\RDROP_n$. + + % this rule is overridden by those ones above. + \item{\verb+*[#*]/+$\RDROP$}\\ + remove the $\RDROP$ and append it as the first child of the following node. + + %************************** \RDROP has neither following nor preceding nodes ****************************** + + \item{\verb+c[#$][!p[*]]/+$\RDROP$}\\ + replace the \CNODE{} with the $\RDROP$. + + \item{\verb+p[^#$]/+$\RDROP$}\\ + move the $\RDROP$ after the \PNODE{} node. + + \item{\verb+g[^#$]/+$\RDROP$}\\ + replace the \G{} node with the $\RDROP$. + +\end{description} + +\paragraph{Normalize Right Drop} + +\begin{description} + + % at the moment it's the only rule, defined for this symbol. + \item{\verb+g[@id][^#$]/+$\RDROP_n$}\\ + replace the \G{} node with the $\RDROP_n$. + + \item{$\RDROP_n$}\\ + replace the $\RDROP$ with the cursor. \end{description} @@ -758,4 +842,64 @@ cursor with \ONODE{}, append $\tadvance$ after the \ONODE{} node % g[@id][^#$]/cursor <- cursor % (!g[@id][^#$])[A#B]/(g[@id][^#$]/)+cursor <- (!g[@id][^#$])[A#B]/cursor +\clearpage +\appendix +\section{Semantics of the Regular Context Language} + +\newcommand{\CSEM}[2]{\mathcal{C}\llbracket#1\rrbracket#2} +\newcommand{\QSEM}[2]{\mathcal{Q}\llbracket#1\rrbracket#2} +\newcommand{\TSEMUP}[2]{\mathcal{T}^\uparrow\llbracket#1\rrbracket#2} +\newcommand{\TSEMDOWN}[2]{\mathcal{T}_\downarrow\llbracket#1\rrbracket#2} +\newcommand{\NSEM}[2]{\mathcal{N}\llbracket#1\rrbracket#2} +\newcommand{\PSEM}[2]{\mathcal{P}\llbracket#1\rrbracket#2} +\newcommand{\PPSEM}[2]{\mathcal{P'}\llbracket#1\rrbracket(#2)} +\newcommand{\PARENT}[1]{\mathit{parent}(#1)} +\newcommand{\CHILDREN}[1]{\mathit{children}(#1)} +\newcommand{\ANCESTORS}[1]{\mathit{ancestors}(#1)} +\newcommand{\DESCENDANTS}[1]{\mathit{descendants}(#1)} +\newcommand{\HASATTRIBUTE}[2]{\mathit{hasAttribute}(#1,#2)} +\newcommand{\HASNOATTRIBUTE}[2]{\mathit{hasNoAttribute}(#1,#2)} +\newcommand{\ATTRIBUTE}[2]{\mathit{attribute}(#1,#2)} +\newcommand{\ISELEMENT}[1]{\mathit{isElement}(#1)} +\newcommand{\NAME}[1]{\mathit{name}(#1)} +\newcommand{\PREV}[1]{\mathit{prev}(#1)} +\newcommand{\NEXT}[1]{\mathit{next}(#1)} + +\[ +\begin{array}{rcl} + \CSEM{.}{x} &=& \{x\}\\ + \CSEM{..}{x} &=& \PARENT{x}\\ + \CSEM{/}{x} &=& \CHILDREN{x}\\ + \CSEM{q}{x} &=& \{x_1\mid x_1\in\{x\} \wedge \QSEM{q}{x_1}\}\\ + \CSEM{(c)}{x} &=& \CSEM{c}{x}\\ + \CSEM{\alpha(c)}{x} &=& \CSEM{c}{x}, \mbox{bind $\alpha$ to $x$}\\ + \CSEM{c_1\&c_2}{x} &=& \CSEM{c_1}{x} \cap \CSEM{c_2}{x}\\ + \CSEM{c_1\mid c_2}{x} &=& \CSEM{c_1}{x} \cup \CSEM{c_2}{x}\\ + \CSEM{c+}{x} &=& \CSEM{c}{x} \cup \CSEM{c+}{\CSEM{c}{x}}\\ + \CSEM{c?}{x} &=& \CSEM{.\mid c}{x}\\ + \CSEM{c*}{x} &=& \CSEM{{c+}?}{x}\\ + \CSEM{c_1\;c_2}{x} &=& \CSEM{c_2}{\CSEM{c_1}{x}}\\[3ex] + \QSEM{\langle*\rangle}{x} &=& \ISELEMENT{x}\\ + \QSEM{\langle!*\rangle}{x} &=& \neg\QSEM{\langle*\rangle}{x}\\ + \QSEM{\langle n_1\mid\cdots\mid n_k\rangle}{x} &=& \exists i\in\{1,\dots,k\}:\NAME{x}=n_i\\ + \QSEM{\langle !n_1\mid\cdots\mid n_k\rangle}{x} &=& \neg\QSEM{\langle n_1\mid\cdots\mid n_k\rangle}{x}\\ + \QSEM{q[@n]}{x} &=& \QSEM{q}{x} \wedge \HASATTRIBUTE{x}{n}\\ + \QSEM{q[!@n]}{x} &=& \QSEM{q}{x} \wedge \HASNOATTRIBUTE{x}{n}\\ + \QSEM{q[@n=v]}{x} &=& \QSEM{q}{x} \wedge \ATTRIBUTE{x}{n}= v\\ + \QSEM{q[!@n=v]}{x} &=& \QSEM{q}{x} \wedge \ATTRIBUTE{x}{n}\ne v\\ + \QSEM{q[p]}{x} &=& \QSEM{q}{x} \wedge \PSEM{p}{x}\\ + \QSEM{q[!p]}{x} &=& \QSEM{q}{x} \wedge \neg\PSEM{p}{x}\\[3ex] + \PSEM{p_1\#p_2}{x} &=& \PPSEM{p_1}{*,x}\wedge\PPSEM{p_2}{x,*}\\ + \PSEM{\cent p_1\#p_2}{x} &=& \PPSEM{p_1}{\cent,x}\wedge\PPSEM{p_2}{x,*}\\ + \PSEM{p_1\#p_2\$}{x} &=& \PPSEM{p_1}{*,x}\wedge\PPSEM{p_2}{x,\$}\\ + \PSEM{\cent p_1\#p_2\$}{x} &=& \PPSEM{p_1}{\cent,x}\wedge\PPSEM{p_2}{x,\$}\\[3ex] + \PPSEM{}{*,x} &=& \mathit{true}\\ + \PPSEM{}{\cent,x} &=& \PREV{x}=\emptyset\\ + \PPSEM{}{x,*} &=& \mathit{true}\\ + \PPSEM{}{x,\$} &=& \NEXT{x}=\emptyset\\ + \PPSEM{p\;c}{\alpha,x} &=& \CSEM{c}{\PREV{x}}\ne\emptyset\wedge\PPSEM{p}{\alpha,\PREV{x}}\\ + \PPSEM{c\;p}{x,\alpha} &=& \CSEM{c}{\NEXT{x}}\ne\emptyset\wedge\PPSEM{p}{\NEXT{x},\alpha}\\ +\end{array} +\] + \end{document}