1 \documentclass[10pt]{article}
10 \title{\EdiTeX: a MathML Editor Based on \TeX{} Syntax\\\small Description and Formal Specification}
11 \author{Paolo Marinelli\\Luca Padovani\\\small\{{\tt pmarinel},{\tt lpadovan}\}{\tt @cs.unibo.it}\\\small Department of Computer Science\\\small University of Bologna}
14 \newcommand{\EdiTeX}{Edi\TeX}
16 \newcommand{\tmap}[1]{\llbracket#1\rrbracket}
17 \newcommand{\tadvance}{\vartriangle}
18 \newcommand{\tnext}{\rhd}
19 \newcommand{\G}{\texttt{g}}
20 \newcommand{\PNODE}{\texttt{p}}
21 \newcommand{\SNODE}{\texttt{s}}
22 \newcommand{\INODE}{\texttt{i}}
23 \newcommand{\NNODE}{\texttt{n}}
24 \newcommand{\ONODE}{\texttt{o}}
25 \newcommand{\CNODE}{\texttt{c}}
26 \newcommand{\TABLE}{\texttt{table}}
27 \newcommand{\SP}{\texttt{sp}}
28 \newcommand{\SB}{\texttt{sb}}
29 \newcommand{\CELL}{\texttt{cell}}
30 \newcommand{\ROW}{\texttt{row}}
31 \newcommand{\SLDROP}{\blacktriangleleft}
32 \newcommand{\NLDROP}{\vartriangleleft}
33 \newcommand{\RDROP}{\vartriangleright}
39 \section{Introduction}
41 MathML~\cite{MathML1,MathML2,MathML2E} is an XML application for the
42 representation of mathematical expressions. As most XML applications,
43 MathML is unsuitable to be hand-written, except for the simplest
44 cases, because of its verbosity. In fact, the MathML specification
45 explicitly states that
47 ``While MathML is human-readable, it is anticipated that, in all but
48 the simplest cases, authors will use equation editors, conversion
49 programs, and other specialized software tools to generate MathML''
52 The statement about human readability of MathML is already too strong,
53 as the large number of mathematical symbols, operators, and
54 diacritical marks that are used in mathematical notation cause MathML
55 documents to make extensive use of Unicode characters that typically
56 are not in the ``visible'' range of common text editors. Such
57 characters may appear as entity references, whose name indicates
58 somehow the kind of symbol used, or character references or they are
59 directly encoded in the document encoding scheme (for instance,
62 It is thus obvious that authoring MathML documents assumes the
63 assistance of dedicated tools. As of today, such tools can be
64 classified into two main categories:
66 \item WYSIWYG (What You See Is What You Get) editors that allow the
67 author to see the formatted document on the screen as it is
69 \item conversion tools that generate MathML markup from different
70 sources, typically other markup languages for scientific
71 documents, such as \TeX.
74 While the former tools are certainly more appealing, especially to the
75 unexperienced user, as they give a direct visual feedback, the
76 existance of tools in the second category takes into account the large
77 availability of existing documents in \TeX{} format, and also the fact
78 that experienced or ``lazy'' users may continue to prefer the use of a
79 markup language other than MathML for editing, and generate MathML
80 only as a final step of the authoring process. The ``laziness'' is not
81 really intended as a way of being reluctant towards a new technology,
82 but rather as a justified convincement that WYSIWYG editors are ``nice
83 to look at'' but after all they may slow down the authoring process.
84 WYSIWYG editors often involve the use of menus, palettes of symbols,
85 and, in general, an extensive use of the pointing device (the mouse)
86 for completing most operations. The use of shortcuts is of little
87 help, as it implies very soon a challenging exercise for the fingers
88 and the mind. Moreover, authors \emph{cannot improve} their authoring
89 speed with time. On the other side, the gap between the syntax of any
90 markup language for mathematics and mathematical notation may be
91 relevant, especially for large, non-trivial formulas and authoring is
92 a re-iterated process in which the author repeadtedly types the markup
93 in the editor, compiles, and looks at the result inside a pre-viewer.
95 \EdiTeX{} tries to synthesize the ``best of both worlds'' in a single
96 tool. The basic idea is that of creating a WYSIWYG editor in which
97 editing is achieved by typing \TeX{} markup as the author would do in
98 a text editor. The \TeX{} markup is tokenized and parsed on-the-fly
99 and a corresponding MathML representation is created and
100 displayed. This way, the author can see the rendered document as it
101 changes. The advantages of this approach can be summarized as follows:
103 \item the document is rendered concurrently with the editing, the
104 user has an immediate feedback hence it is easier to spot errors;
105 \item the author types in a concrete (and likely familiar) syntax
106 improving the editing speed;
107 \item the usual WYSIWYG mechanisms are still available. In
108 particular, it is possible to select \emph{visually} a fragment of
109 the document that needs re-editing, or that was left behind for
113 \paragraph{The Name of the Game:} there is no reference to MathML in
114 the name ``\EdiTeX.'' In fact, the architecture of the editor is not
115 tied to MathML markup. Although we focus on MathML editing, by
116 changing a completely modularized component of the editor it is
117 virtually possible to generate any other markup language.
119 \paragraph{Acknowledgments.} Stephen M. Watt and Igor Rodionov for
120 their work on the \TeX{} to MathML conversion tool; Stan Devitt for an
121 illuminating discussion about the architecture of \TeX{} to XML
122 conversion tools; Claudio Sacerdoti Coen for the valuable feedback and
123 uncountable bug reports.
125 \section{Architecture}
127 \section{Customization}
129 \subsection{Short and Long Identifiers}
131 \subsection{The Dictionary}
133 \subsection{Stylesheets and Trasformations}
135 \subsection{Rendering}
137 \section{XML Representation of \TeX{} Markup}
141 The following tokens are defined:
143 \begin{tabular}{lllp{0.5\textwidth}}
144 \textbf{\TeX{}} & \textbf{Notation} & \textbf{Node} & \textbf{Description} \\
146 \verb+{+ & $\mathrm{begin}$ & \texttt{g} & Beginning of a group \\
147 \verb+}+ & $\mathrm{end}$ & & End of a group \\
148 \verb+$+ & $\$$ & \texttt{math} & Math shift \\ %$ \\
150 \verb+#+$i$ & $p(i)$ & \texttt{p} & Parameter \\
151 \verb+^+ & $\uparrow$ & \texttt{sp} & Superscript \\
152 \verb+_+ & $\downarrow$ & \texttt{sb} & Subscript \\
153 & $\square$ & & Space-like character that can be ignored \\
154 & $s$ & \texttt{s} & Space-like character that may be significant \\
155 letter & $i(v)$ & \texttt{i} & Identifier $v$ \\
156 digit & $n(v)$ & \texttt{n} & Number $v$ \\
157 other & $o(v)$ & \texttt{o} & Other character or operator $v$ \\
158 \verb+~+ & $\sim$ & & Active character \\
159 \verb+%+ & $\%$ & & Comment \\
160 control & $c(v)\langle\alpha_1,\dots,\alpha_n\rangle$ & \texttt{c} &
161 Control sequence $v$ that expects the $\alpha_1,\dots,\alpha_n$ sequence of tokens. \\
162 backspace & $\vartriangleleft$ & & \\
163 backspace & $\blacktriangleleft$ & & \\
166 %% Some tokens are mapped directly into nodes of the TML tree. The following functions shows
169 \begin{tabular}{r@{\quad$=$\quad}l}
170 $\tmap{\{}$ & \verb+g+ \\
171 $\tmap{p(i)}$ & \verb+p[@index=+$i$\verb+]+ \\
172 $\tmap{p_l(i)}$ & \verb+p[@index=+$i$\verb+][@left-open='1']+ \\
173 $\tmap{p_r(i)}$ & \verb+p[@index=+$i$\verb+][@right-open='1']+ \\
174 $\tmap{s}$ & \verb+s+ \\
175 $\tmap{\uparrow}$ & \verb+sp+ \\
176 $\tmap{\downarrow}$ & \verb+sb+ \\
177 $\tmap{i(v)}$ & \verb+i[@value=+$v$\verb+]+ \\
178 $\tmap{n(v)}$ & \verb+n[@value=+$v$\verb+]+ \\
179 $\tmap{o(v)}$ & \verb+o[@value=+$v$\verb+]+ \\
180 $\tmap{c(v)\langle\alpha_1,\dots,\alpha_n\rangle}$ & \verb+c[@name=+$v$\verb+][^+$\tmap{\alpha_1}\cdots\tmap{\alpha_n}$\verb+$]+\\
184 \section{Description and Semantics of the Pattern Language}
187 %% \mathit{NodeTest} & ::= & \mathtt{*} \\
188 %% & | & \mathit{ElementType} \\
189 %% & | & \mathtt{<}~\mathit{ElementTypePattern}~\mathtt{>} \\[1ex]
190 %% \mathit{ElementTypePattern} & ::= & \mathtt{*} \\
191 %% & | & \mathit{ElementType}~(\mathtt{|}~\mathit{ElementType})^* \\
192 %% & | & \mathtt{!}\mathit{ElementType}~(\mathtt{|}~\mathit{ElementType})^*\\[1ex]
193 %% \mathit{NodePattern} & ::= & \mathit{NodeTest}~\mathit{AttributeQualifier}^*\\[1ex]
194 %% \mathit{AttributeQualifier} & ::= & \mathtt{[@}\mathit{AttributeTest}\mathtt{]}\\
195 %% & | & \mathtt{[!@}\mathit{AttributeTest}\mathtt{]}\\[1ex]
196 %% \mathit{AttributeTest} & ::= & \mathit{AttributeName} \\
197 %% & | & \mathit{AttributeName}\mathtt{='}\mathit{Text}\mathtt{'}
202 \begin{array}{rcl@{\hspace{3em}}rcl@{\hspace{3em}}rcl}
203 C &::=& . & Q &::=& \langle*\rangle & P &::=& P'\#P' \\
204 &|& .. & &|& \langle!*\rangle & &|& \cent P'\#P'\\
205 &|& / & &|& \langle n_1\mid\cdots\mid n_k\rangle & &|& P'\#P'\$\\%$
206 &|& Q & &|& \langle!n_1\mid\cdots\mid n_k\rangle & &|& \cent P'\#P'\$\\%$
207 &|& (C) & &|& Q[@n] & & &\\
208 &|& \{C:\Gamma\} & &|& Q[!@n] & P' &::=& \\
209 &|& C\&C & &|& Q[@n=v] & &|& C\;P'\\
210 &|& C\mid C & &|& Q[!@n=v] & & &\\
211 &|& C+ & &|& Q[P] & & &\\
212 &|& C? & &|& Q[!P] & & &\\
214 &|& C\;C & & & & & &\\
218 \caption{Syntax of the regular context language. $n$, $n_i$ denote
219 names, $v$ denotes a string enclosed in single or double quotes}
223 \section{Insert Rules}
225 \paragraph{Begin Group:} $\{$
228 \item{\verb+table/cursor+}\\
229 create a \texttt{row} node, create a \texttt{cell} node, create a \texttt{g} node,
230 append the cursor to the \texttt{g} node, append the \texttt{g} node to the \texttt{cell} node,
231 append the \texttt{cell} node to the \texttt{row} node, append the \texttt{row} node to the
233 \item{\verb+cursor+} \\ create a \texttt{g} node, replace the cursor with the new \texttt{g} node,
234 append the cursor to the new \texttt{g} node
237 % CASE: c/g[!@id]/cursor
244 % CASE: c[@table='1']/cursor
246 % create a g node with id, replace the cursor with the fresh g and append
247 % the cursor as only child of it
249 \paragraph{End Group:} $\}$
252 \item{\verb+g[@id]/cursor+}\\
253 remove the cursor, put $\tadvance$ after the \texttt{g} node
254 \item{\verb+row/cell/g/cursor+}\\
255 remove the cursor, put $\tadvance$ after the \texttt{row} node
256 \item{\verb+math/g[!@id]/cursor+}\\
258 \item{\verb+cursor+}\\
262 \paragraph{Math Shift:} $\$$
265 \item{\verb+tex/cursor+}\\
266 create a \texttt{math} node, create a \texttt{g} node, append the \texttt{g} node
267 as child of the \texttt{math} node, append the cursor as child of the \texttt{g} node
268 \item{\verb+math[@display='1']/g[!@id][*#]/cursor+}\\
269 append the cursor as last child of the \texttt{math} node
270 \item{\verb+math/g[!@id][*#]/cursor+}\\
272 \item{\verb+math[!display='1']/g[!@id]/cursor+}\\
273 set \verb+display='1'+ in the \texttt{math} node
274 \item{\verb+math/g[!@id]+}\\
275 append the cursor after the \texttt{math} node
276 \item{\verb+math/cursor+}\\
278 \item{\verb+cursor+} \\
284 % create a math node. create a g node. append g as child of math.
285 % append the cursor as child of g
286 % CASE: math[@display='1']/g[!@id][*#]/cursor
287 % append the cursor as last child of math
288 % CASE: math/g[!@id][*#]/cursor
289 % remove the cursor. Editing is done
290 % CASE: math[!display='1']/g[!@id]/cursor
291 % set the display attribute to '1'
293 % append the cursor after math (?)
295 % remove the cursor. Editing is done
299 \paragraph{Align:} $\&$
302 \item{\verb+g[@id]/cursor+}\\
303 create a \texttt{row} node, create a \texttt{cell} node, create a \texttt{g} node,
304 append the cursor to the new \texttt{g} node, append the \texttt{cell} node to the
305 the \texttt{row} node ?
306 \item{\verb+row/cell/g/cursor+}\\
307 create the \texttt{g} node, create the \texttt{cell} node, append the cursor
308 as child of the new \texttt{g} node, append the new \texttt{g} node to the new
309 \texttt{cell} node after the old \texttt{cell} node
310 \item{\verb+cursor+}\\
315 % CASE: g[@id]/cursor
316 % create a row node. create a cell node. create a g node. append the
317 % cursor to g, append the g to cell, append the cell to row, ???
318 % CASE: row/cell/g/cursor
319 % create a g node. create a cell node. appent the cursor to g,
320 % append the g to cell, insert the new cell after the existing cell
324 \paragraph{End-of-line:}
329 \paragraph{Parameter:} $p(i)$
333 \paragraph{Superscript:} $\uparrow$
336 \item{\verb+<g|p>[^#]/cursor+}\\
337 create a \SP{} node, create a \G{} node, replace the cursor with the \SP{} node,
338 append the \G{} node as first child of the \SP{} node, append the cursor as last
339 child of the \SP{} node
340 \item{\verb+<g|p>[*#]/cursor+}\\
341 create a \SP{} node, replace \texttt{*} with the \SP{} node, append \texttt{*} to
342 the \SP{} node, append cursor to the \SP{} node
343 \item{\verb+sp[^*#$][!@over='1']/cursor+}\\ %$
344 set \verb+over='1'+ in the \SP{} node
345 \item{\verb+sp[^*#$][@over='1']/cursor+}\\ %$
347 \item{\verb+cursor+}\\
352 % create sp node. create g node, replace cursor with sp, append g to sp, append cursor to sp
354 % create sp node, replace * with sp, append * to sp, append cursor to sp
355 % CASE: sp[^*#$][!@over='1']/cursor
356 % set over='1' in sp node
357 % CASE: sp[^*#$][@over='1']/cursor
362 \paragraph{Subscript:} $\downarrow$
365 \item{\verb+<g|p>[^#]/cursor+}\\
366 create a \SB{} node, create a \G{} node, replace the cursor with the \SB{} node,
367 append the \G{} node as first child of the \SB{} node, append the cursor as last
368 child of the \SB{} node
369 \item{\verb+<g|p>[*#]/cursor+}\\
370 create a \SB{} node, replace \texttt{*} with the \SB{} node, append \texttt{*} to
371 the \SB{} node, append cursor to the \SB{} node
372 \item{\verb+sb[^*#$][!@under='1']/cursor+}\\ %$
373 set \verb+under='1'+ in the \SB{} node
374 \item{\verb+sb[^*#$][@under='1']/cursor+}\\ %$
376 \item{\verb+cursor+}\\
381 % create sb node. create g node, replace cursor with sb, append g to sb, append cursor to sb
383 % create sb node, replace * with sb, append * to sb, append cursor to sb
384 % CASE: sb[^*#$][!@under='1']/cursor
385 % set over='1' in sb node
386 % CASE: sb[^*#$][@under='1']/cursor
391 \paragraph{Ignorable space:} $\square$
393 % do_ignorable_space:
396 \paragraph{Space:} $s$
399 \item{\verb+cursor+}\\
400 create \SNODE{} node, replace cursor with the \SNODE{} node, append
401 $\tadvance$ after \SNODE{} node
405 % create s node, replace cursor with s, append \advance after s
407 \paragraph{Identifier:} $i(v)$
410 \item{\verb+cursor+}\\
411 create an \INODE{}, set \verb+value=+$v$ in the \INODE{}, replace
412 cursor with \INODE{}, append $\tadvance$ after the \INODE{} node
416 % create i node, replace cursor with i, append \advance after i
418 \paragraph{Number:} $n(v)$
421 \item{\verb+cursor+}\\
422 create an \NNODE{}, set \verb+value=+$v$ in the \NNODE{}, replace
423 cursor with \NNODE{}, append $\tadvance$ after the \NNODE{} node
427 % create n node, replace cursor with n, append \advance after n
429 \paragraph{Apostrophe:} $o({}')$
432 \item{\verb+<g/p>[(sp[*#$]/g[o[@name='prime']$])#]/cursor+}\\
433 create a \ONODE{} node, set \verb+name='prime'+ in the \ONODE{},
434 append the \ONODE{} to the innermost \G{} node
435 \item{\verb+<g|p>[(sb[^sp[^*#$]/g[o[@name='prime']]$])#]/cursor+}\\
436 create a \ONODE{} node, set \verb+name='prime'+ in the \ONODE{},
437 append the \ONODE{} to the innermost \G{} node
438 \item{\verb+<g|p>[*#]/cursor+}\\
439 create a \ONODE{} node, set \verb+name='prime'+ in the \ONODE{},
440 create a \SP{} node, create a \G{} node, replace \texttt{*} with \SP{} node,
441 append the new \G{} node to the \SP{} node, append the \ONODE{}
442 node to the new \G{} node
443 \item{\verb+<g|p>[^#]/cursor+}\\
445 \item{\verb+cursor+}\\
446 cursor is not in a group, error?
450 % CASE: g[(sp[^*#$]/g[o[@name='prime']$])#]/cursor
451 % append a new o[@name='prime'] node to the inner g node
452 % CASE: g[(sb[^sp[^*#$]/g[o[@name='prime']]$])#]/cursor
453 % append a new o[@name='prime'] node to the inner g node
455 % create sp node, create g node, replace * with sp, append * to sp, append g to sp,
456 % append a new o[@name='prime'[ node to the new g node
460 % cursor is not in a group, error?
462 \paragraph{Other:} $o(v)$
464 create an \ONODE{}, set \verb+value=+$v$ in the \ONODE{}, replace
465 cursor with \ONODE{}, append $\tadvance$ after the \ONODE{} node
468 % create o node, replace cursor with o, append \advance after o
470 \paragraph{Active:} $\sim$
475 \paragraph{Comment:} $\%$
480 \paragraph{Begin Environment:} $c(\mathtt{begin})\langle\alpha_1,\dots,\alpha_n\rangle$
482 \paragraph{End Environment:} $c(\mathtt{end})\langle\rangle$
484 \paragraph{Left Delimiter:} $c(\mathtt{left})\langle\alpha\rangle$
486 \paragraph{Right Delimiter:} $c(\mathtt{right})\langle\alpha\rangle$
488 \paragraph{Carriage-Return:} $c(\mathtt{cr})\langle\rangle$
491 \item{\verb+row/cell/g/cursor+}\\
492 create a \ROW{} node, create a \CELL{} node, create a \G{}
493 node, append the cursor to the new \G{} node, append the new \G{}
494 node to the new \CELL{} node, append the new \CELL{} node to the
495 new \ROW{} node, insert the new \ROW{} node after the old \ROW{} node
496 \item{\verb+cursor+}\\
501 % CASE: row/cell/g/cursor
502 % create row node, create cell node, create g node,
503 % append cursor to g, append g to cell, append cell to row,
504 % insert new row after old row
508 \paragraph{Macro:} $c(v)\langle\alpha_1,\dots,\alpha_n\rangle$
511 \item{\verb+<p|g>/cursor+}\\
512 create a \CNODE{} node with the children corresponding to the pattern
513 $\tmap{\alpha_1}$,\dots,$\tmap{\alpha_n}$, replace the cursor with
514 the new \CNODE{} node. put $\tnext$ as the first child of the new
517 \item{\verb+*/cursor+}\\
518 create a \CNODE{} node with the children corresponding to the pattern
519 $\tmap{\alpha_1}$,\dots,$\tmap{\alpha_n}$, replace the cursor with
520 the new \CNODE{} node, put $\tnext$ as the first child of the new
521 \CNODE{} node. If $n\ne0$ emit a warning (the macro has arguments but
522 but the context wouldn't normally allow them to be entered)
527 % create a c node with children corresponding to the pattern of the macro
528 % append \nextparam as first child of the macro
530 \section{Left Drop Rules}
532 \paragraph{Normal Left Drop:} $\NLDROP$
536 \item{\verb+cursor+}\\
537 replace the cursor with the $\NLDROP$.
541 \paragraph{Special Left Drop:} $\SLDROP$
545 \item{\verb+cursor+}\\
546 replace the cursor with the $\SLDROP$.
550 \section{Right Drop Rules}
554 \item{\verb+cursor+}\\
555 replace the cursor with the $\RDROP$.
559 \section{$\varepsilon$-rules}
561 \paragraph{Nromal Left Drop}
565 \item{\verb+math/g[^#]/+$\NLDROP$}\\
566 repalce the $\NLDROP$ with the cursor.
568 %**************************************************************************************
569 %****************************** epsilon-rules with \NLDROP ****************************
570 %**************************************************************************************
572 %************** \NLDROP has neither preceding nor following nodes ********************
574 \item{\verb+math[^#$]/+$\NLDROP$}\\
575 replace the $\NLDROP$ with the cursor.
577 \item{\verb+g[^#$]/+$\NLDROP$}\\
578 replace the \G{} node with the $\NLDROP$.
580 % this rule is overridden by the two ones below
581 \item{\verb+c/p[^#$]/+$\NLDROP$}\\
582 remove the $\NLDROP$ and insert it before the \PNODE{} node.
584 \item{\verb+c[p[@left-open='1'][*]#$]/p[@right-open='1'][^#$]/+$\NLDROP$}\\
585 replace the \CNODE{} node with the content of the first \PNODE{} node and insert the $\NLDROP$ after this content
587 \item{\verb+c[p[@left-open='1'][!*]#$]/p[@right-open='1'][^#$]/+$\NLDROP$}\\
588 replace the \CNODE{} node with the $\NLDROP$.
590 \item{\verb+c[^#][!p(*)]/+$\NLDROP$}\\
591 replace the \CNODE{} node with the $\NLDROP$.
593 \item{\verb+cell[^#$]/+$\NLDROP$}\\
594 replace the cell with the $\NLDROP_n$.
596 \item{\verb+table[^#$]/+$\NLDROP$}\\
597 replace the \TABLE{} node with the $\NLDROP$.
599 %************************* \NLDROP has at least one preceding node *********************
603 % this rule should also handles the case where the \NLDROP is the third (and last) child of a script.
604 \item{\verb+*[*#]/+$\NLDROP$}\\
605 remove the $\NLDROP$ and append it as the last child of its ex preceding brother.
607 % this rule overrides the one above
608 \item{\verb+*[(i|n|o|s|c[!*])#]/+$\NLDROP$}\\
609 remove the $\NLDROP$ and replace the token with the $\NLDROP_n$.
613 \item{\verb+<sp|sb>[^*#$]+/$\NLDROP$}\\
614 replace the script node with its first child and insert the $\NLDROP$ after it.
616 % this rule overrides the one above.
617 \item{\verb+<sp|sb>[^g[!@id][!*]#$]/+$\NLDROP$}\\
618 replace the script with the cursor.
620 % this rule overrides the one above
621 \item{\verb+*[sp[!@id][^*g[!@id][^o[@name='prime']++\verb+o[@name='prime']$]]#]/+$\NLDROP$}\\
622 remove the last \ONODE{} node and replace the $\NLDROP$ with the cursor.%$\NLDROP_n$.
624 \item{\verb+*[sp[!@id][^*g[!@id][^o[@name='prime']$]]#]/+$\NLDROP$}\\
625 replace the script with its first child and replace the $\NLDROP$ with the cursor.%$\NLDROP_n$.
627 \item{\verb+c[(i|n|o|s|c[!*])#]/+$\NLDROP$}\\
628 move the $\NLDROP$ before the delimiter.
630 % this rule is true for both right-open and parameterized macros.
631 \item{\verb+c[p#]/+$\NLDROP$}\\
632 move the $\NLDROP$ into the \PNODE{} node.
634 %**************** \NLDROP has no preceding nodes, but has following nodes **************
637 \item{\verb+*[^#*]/+$\NLDROP$}\\
638 remove the $\NLDROP$ and insert it before its parent.
642 % this rule is applicable to all macros.
643 \item{\verb+c[^#][p[*]]/+$\NLDROP$}\\
644 remove the $\NLDROP$ and insert it before the \CNODE{} node.
648 \paragraph{Special Left Drop}
652 %********************************************************************************************************
653 %************************************ epsilon-rules with \SLDROP ****************************************
654 %********************************************************************************************************
656 \item{\verb+math/+$\SLDROP$}\\
657 replace the $\SLDROP$ with the cursor.
659 \item{\verb+math/g[^#]/+$\NLDROP$}\\
660 replace the $\NLDROP$ with the cursor.
662 %************************ \SLDROP has neither preceding nor following nodes *****************************
664 \item{\verb+g[^#$]/+$\SLDROP$}\\
665 replace the \G{} node with the cursor.
667 \item{\verb+c[p[@left-open='1'][*]#$]/p[@right-open='1'][^#$]/+$\SLDROP$}\\
668 replace the \CNODE{} node with the content of the first \PNODE{} node and insert the cursor after this content
670 \item{\verb+c[p[@left-open='1'][!*]#$]/p[@right-open='1'][^#$]/+$\SLDROP$}\\
671 replace the \CNODE{} node with the cursor.
673 \item{\verb+c/p[^#$]/+$\SLDROP$}\\
674 remove the $\SLDROP$ and insert it before the \PNODE{} node.
676 \item{\verb+c[^#][!p(*)]/+$\SLDROP$}\\
677 replace the \CNODE{} node with the cursor.
679 \item{\verb+cell[^#$]/+$\SLDROP$}\\
680 replace the cell with the $\NLDROP_n$.
682 \item{\verb+table[^#$]/+$\SLDROP$}\\
683 replace the \TABLE{} node with the cursor.
685 %*********************** \SLDROP has at least one preceding node ***********************************
687 \item{\verb+*[sp[!@id][^*g[!@id][^o[@name='prime']++\verb+o[@name='prime']$]]#]/+$\SLDROP$}\\
688 remove the last \ONODE{} node and replace the $\SLDROP$ with the cursor.
690 \item{\verb+*[sp[!@id][^*g[!@id][^o[@name='prime']$]]#]/+$\SLDROP$}\\
691 replace the script with its first child and replace the $\SLDROP$ with the cursor.%$\NLDROP_n$.
693 \item{\verb+<sp|sb>[^g[!@id][!*]#$]/+$\SLDROP$}\\
694 replace the script with the cursor.
696 % this rule is overridden by the three rules above.
697 \item{\verb+<sp|sb>[^*#$]+/$\SLDROP$}\\
698 replace the script node with its first child and insert the cursor after it.
700 \item{\verb+c[(i|n|o|s|c[!*])#]/+$\SLDROP$}\\
701 remove the $\SLDROP$ and insert the cursor before the delimiter.
703 \item{\verb+c[p#(i|n|o|s|c[!*])]/+$\SLDROP$}\\
704 remove the $\SLDROP$ and insert the cursor into the \PNODE{} node.
706 \item{\verb+c[p[@right-open='1']#]+}\\
707 remove the $\SLDROP$ and append the curor as last child of the \PNODE{} node.
709 % this rule is overridden by the two ones above.
710 \item{\verb+c[p#]/+$\SLDROP$}\\
711 move the $\SLDROP$ into the \PNODE{} node.
713 \item{\verb+*[(i|n|o|s|c[!*])#]/+$\SLDROP$}\\
714 remove the $\SLDROP$ and replace the token with the cursor.
716 \item{\verb+*[table#]/+$\SLDROP$}\\
717 remove the $\SLDROP$ and append the $\NLDROP_n$ as the last child of the \TABLE{} node.
719 \item{\verb+*[c#]/+$\SLDROP$}\\
720 move the $\SLDROP$ into the \CNODE{} node.
722 \item{\verb+*[g#]/+$\SLDROP$}\\
723 remove the $\SLDROP$ and append the cursor as the last child of the \G{} node.
725 %********** \SLDROP has no preceding node, but has following ones **************
727 \item{\verb+c[^#p][p(*)]/+$\SLDROP$}\\
728 remove the $\SLDROP$ and insert the cursor before the \CNODE{} node.
731 \item{\verb+*[^#*]/+$\SLDROP$}\\
732 remove the $\SLDROP$ and insert the cursor before its parent.
736 \paragraph{Normalize Left Drop}
740 %****************************************************************************************
741 %***************************** epsilon-rules with \NLDROP_n *****************************
742 %****************************************************************************************
744 \item{\verb+*[*#]/+$\NLDROP_n$}\\
745 replace the $\NLDROP_n$ with the cursor.
747 \item{\verb+row[cell#]/+$\NLDROP_n$}\\
748 remove the $\NLDROP_n$ and append the cursor as the last child of the \CELL{} node.
750 \item{\verb+row[^#$]/+$\NLDROP_n$}\\
751 replace the \ROW{} node with the $\NLDROP_n$
753 \item{\verb+table[row#]/+$\NLDROP_n$}\\
754 remove the $\NLDROP_n$ and append it as last child of the \ROW{} node.
756 \item{\verb+table[^#$]/+$\NLDROP_n$}\\
757 replace the \TABLE{} with the cursor.%$\NLDROP_n$.
759 \item{\verb+g[@id][^#$]/+$\NLDROP_n$}\\
760 replace the \G{} node with the $\NLDROP_n$.
763 replace the $\NLDROP_n$ with the cursor.
767 \paragraph{Right Drop}
771 %************************* \RDROP has at least a following node ****************************************
773 \item{\verb+c[#(i|n|o|s|c[!*])]/+$\RDROP$}\\
774 remove the $\RDROP$ and append it after the delimiter
776 \item{\verb+*[#(i|n|o|s|c[!*])]/+$\RDROP$}\\
777 remove the token and replace the $\RDROP$ with the cursor $\RDROP_n$.
779 % this rule is overridden by those ones above.
780 \item{\verb+*[#*]/+$\RDROP$}\\
781 remove the $\RDROP$ and append it as the first child of the following node.
783 %************************** \RDROP has neither following nor preceding nodes ******************************
785 \item{\verb+c[#$][!p[*]]/+$\RDROP$}\\
786 replace the \CNODE{} with the $\RDROP$.
788 \item{\verb+p[^#$]/+$\RDROP$}\\
789 move the $\RDROP$ after the \PNODE{} node.
791 \item{\verb+g[^#$]/+$\RDROP$}\\
792 replace the \G{} node with the $\RDROP$.
796 \paragraph{Normalize Right Drop}
800 % at the moment it's the only rule, defined for this symbol.
801 \item{\verb+g[@id][^#$]/+$\RDROP_n$}\\
802 replace the \G{} node with the $\RDROP_n$.
805 replace the $\RDROP$ with the cursor.
812 \item{\verb+g/+$\tadvance$}\\
813 replace $\tadvance$ with the cursor
815 \item{\verb+p[#$]/+$\tadvance$}\\ %$
816 put $\tadvance$ after the \PNODE{} node
818 \item{\verb+c[#p]/+$\tadvance$} \\
819 remove $\tadvance$, put the cursor as first child of the \PNODE{} node
821 \item{\verb+c[#*]/+$\tadvance$} \\ %$
822 replace $\tadvance$ with the cursor
824 \item{\verb+c[#$]/+$\tadvance$} \\ %$
825 move $\tadvance$ after the \CNODE{} node
828 \paragraph{Next Parameter}
830 \paragraph{Next Token}
832 %% \begin{description}
833 %% \item{\verb+c[#p]/+$\tnext$} \\
836 % g[@id]/(c[#$][@right-open]/g[!@id][#$]/)+cursor } let p = cursor.parent() in remove; advance(p)
840 % */cursor { let g = new group in replace
842 % g[@id][^#$]/cursor <= cursor.parent().replace(cursor)
843 % g[@id][^#$]/cursor <- cursor
844 % (!g[@id][^#$])[A#B]/(g[@id][^#$]/)+cursor <- (!g[@id][^#$])[A#B]/cursor
848 \section{Semantics of the Regular Context Language}
850 \newcommand{\CSEM}[2]{\mathcal{C}\llbracket#1\rrbracket#2}
851 \newcommand{\QSEM}[2]{\mathcal{Q}\llbracket#1\rrbracket#2}
852 \newcommand{\TSEMUP}[2]{\mathcal{T}^\uparrow\llbracket#1\rrbracket#2}
853 \newcommand{\TSEMDOWN}[2]{\mathcal{T}_\downarrow\llbracket#1\rrbracket#2}
854 \newcommand{\NSEM}[2]{\mathcal{N}\llbracket#1\rrbracket#2}
855 \newcommand{\PSEM}[1]{\mathcal{P}\llbracket#1\rrbracket}
856 \newcommand{\LSEM}[2]{\mathcal{L}\llbracket#1\rrbracket#2}
857 \newcommand{\RSEM}[2]{\mathcal{R}\llbracket#1\rrbracket#2}
858 \newcommand{\FSEM}[2]{\mathcal{F}\llbracket#1\rrbracket(#2)}
859 \newcommand{\PARENT}[1]{\mathit{parent}(#1)}
860 \newcommand{\CHILDREN}[1]{\mathit{children}(#1)}
861 \newcommand{\CHILD}[1]{\mathit{child}(#1)}
862 \newcommand{\ANCESTORS}[1]{\mathit{ancestors}(#1)}
863 \newcommand{\DESCENDANTS}[1]{\mathit{descendants}(#1)}
864 \newcommand{\HASATTRIBUTE}[2]{\mathit{hasAttribute}(#1,#2)}
865 \newcommand{\HASNOATTRIBUTE}[2]{\mathit{hasNoAttribute}(#1,#2)}
866 \newcommand{\ATTRIBUTE}[2]{\mathit{attribute}(#1,#2)}
867 \newcommand{\ISELEMENT}[1]{\mathit{isElement}(#1)}
868 \newcommand{\NAME}[1]{\mathit{name}(#1)}
869 \newcommand{\PREV}[1]{\mathit{prev}(#1)}
870 \newcommand{\NEXT}[1]{\mathit{next}(#1)}
871 \newcommand{\PREDICATE}[1]{\mathit{predicate}(#1)}
872 \newcommand{\IFV}[3]{\begin{array}[t]{@{}l}\mathbf{if}~#1~\mathbf{then}\\\quad#2\\\mathbf{else}\\\quad#3\end{array}}
873 \newcommand{\IFH}[3]{\mathbf{if}~#1~\mathbf{then}~#2~\mathbf{else}~#3}
874 \newcommand{\TRUE}{\mathbf{true}}
875 \newcommand{\FALSE}{\mathbf{false}}
876 \newcommand{\FUN}[2]{\lambda#1.#2}
877 \newcommand{\LET}[3]{\mathbf{let}~#1=#2~\mathbf{in}~#3}
878 \newcommand{\REC}[2]{\mathbf{rec}~#1=#2}
879 \newcommand{\APPLY}[2]{(#1\;#2)}
880 \newcommand{\APPLYX}[3]{(#1\;#2\;#3)}
881 \newcommand{\AND}{\wedge}
882 \newcommand{\OR}{\vee}
883 \newcommand{\AAND}{\,\vec{\AND}\,}
884 \newcommand{\AOR}{\,\vec{\OR}\,}
885 \newcommand{\MATCH}[4]{\begin{array}[t]{@{}c@{~\to~}l@{}l@{}}\multicolumn{2}{@{}l@{}}{\mathbf{match}~#1~\mathbf{with}}\\\phantom{|}\quad\{#2\}\\|\quad\emptyset\end{array}}
889 \CSEM{q}{x} &=& \{x_1\mid x_1\in\{x\} \wedge \QSEM{q}{x_1}\}\\
890 \CSEM{..}{x} &=& \PARENT{x}\\
891 \CSEM{/}{x} &=& \CHILDREN{x}\\
892 \CSEM{c_1\;c_2}{x} &=& \CSEM{c_2}{\CSEM{c_1}{x}}\\
893 \CSEM{(c)}{x} &=& \CSEM{c}{x}\\
894 \CSEM{\{c:\alpha\}}{x} &=& \alpha(x,\CSEM{c}{x})\\
895 \CSEM{c_1\&c_2}{x} &=& \CSEM{c_1}{x} \cap \CSEM{c_2}{x}\\
896 \CSEM{c_1\mid c_2}{x} &=& \CSEM{c_1}{x} \cup \CSEM{c_2}{x}\\
897 \CSEM{c+}{x} &=& \CSEM{c}{x} \cup \CSEM{c+}{\CSEM{c}{x}}\\
898 \CSEM{c?}{x} &=& \{x\}\cup\CSEM{c}{x}\\
899 \CSEM{c*}{x} &=& \CSEM{{c+}?}{x}\\[3ex]
900 \QSEM{c}{x} &=& \CSEM{c}{x}\ne\emptyset\\
901 \QSEM{!c}{x} &=& \CSEM{c}{x}=\emptyset\\
902 \QSEM{\langle*\rangle}{x} &=& \TRUE\\
903 \QSEM{\langle n\rangle}{x} &=& \NAME{x}=n\\
904 \QSEM{@n}{x} &=& \HASATTRIBUTE{x}{n}\\
905 \QSEM{@n=v}{x} &=& \ATTRIBUTE{x}{n}=v\\
906 \QSEM{[p_1\#p_2]}{x} &=& \LSEM{p_1}{\PREV{x}}\wedge\RSEM{p_2}{\NEXT{x}}\\[3ex]
907 \LSEM{}{\alpha} &=& \TRUE\\
908 \LSEM{\cent}{\alpha} &=& \alpha=\emptyset\\
909 \LSEM{p\;q}{\emptyset} &=& \FALSE\\
910 \LSEM{p\;q}{\{x\}} &=& \QSEM{q}{x}\wedge\LSEM{p}{\PREV{x}}\\[3ex]
911 \RSEM{}{\alpha} &=& \TRUE\\
912 \RSEM{\$}{\alpha} &=& \alpha=\emptyset\\
913 \RSEM{q\;p}{\emptyset} &=& \FALSE\\
914 \RSEM{q\;p}{\{x\}} &=& \QSEM{q}{x}\wedge\RSEM{p}{\NEXT{x}}\\[3ex]
915 \PREDICATE{q} &=& \TRUE\\
916 \PREDICATE{..} &=& \FALSE\\
917 \PREDICATE{/} &=& \FALSE\\
918 \PREDICATE{c_1\;c_2} &=& \PREDICATE{c_1}\wedge\PREDICATE{c_2}\\
919 \PREDICATE{(c)} &=& \PREDICATE{c}\\
920 \PREDICATE{c_1\&c_2} &=& \PREDICATE{c_1}\wedge\PREDICATE{c_2}\\
921 \PREDICATE{c_1\mid c_2} &=& \PREDICATE{c_1}\wedge\PREDICATE{c_2}\\
922 \PREDICATE{c+} &=& \PREDICATE{c}\\
923 \PREDICATE{c?} &=& \PREDICATE{c}\\
924 \PREDICATE{c*} &=& \PREDICATE{c}
930 \PSEM{q} &=& \FUN{x}{\APPLY{\QSEM{q}{}}{x}} \\
931 \PSEM{..} &=& \FUN{x}{\neg\APPLY{\mathit{null}}{\PARENT{x}}}\\
932 \PSEM{/} &=& \FUN{x}{\neg\APPLY{\mathit{null}}{\CHILD{x}}}\\
933 \PSEM{(c)} &=& \PSEM{c}\\
934 \PSEM{\{c:\alpha\}} &=& \FUN{x}{\APPLY{\PSEM{c}}{x}\AAND\APPLY{\alpha}{x}}\\
935 \PSEM{c_1\;c_2} &=& \IFV{\PREDICATE{c_1}}{\FUN{x}{(\PSEM{c_1}\;x)\wedge(\PSEM{c_2}\;x)}}{\FSEM{c_1}{\PSEM{c_2},\FUN{\_}{\FALSE}}}\\
936 \PSEM{c_1\&c_2} &=& \IFV{\PREDICATE{c_1}\wedge\PREDICATE{c_2}}{\FUN{x}{(\PSEM{c_1}\;x)\wedge(\PSEM{c_2}\;x)}}{\FSEM{c_1\&c_2}{\FUN{\_}{\TRUE},\FUN{\_}{\FALSE}}}\\
937 \PSEM{c_1\mid c_2} &=& \FUN{x}{(\PSEM{c_1}\;x)\vee(\PSEM{c_2}\;x)}\\
938 \PSEM{c+} &=& \PSEM{c}\\
939 \PSEM{c?} &=& \FUN{\_}{\TRUE}\\
940 \PSEM{c*} &=& \FUN{\_}{\TRUE}\\[3ex]
941 \FSEM{q}{t,f} &=& \FUN{x}{(\APPLY{\PSEM{q}}{x}\AAND\APPLY{t}{x})\AOR\APPLY{f}{x}}\\
942 \FSEM{..}{t,f} &=& \FUN{x}{\MATCH{\PARENT{x}}{y}{\APPLY{t}{y}}{\APPLY{f}{x}}}\\
943 % \FSEM{/}{t,f} &=& \FUN{x}{(\vee_{y\in\CHILDREN{x}} \APPLY{t}{y})\AOR\APPLY{f}{x}}\\
944 \FSEM{/}{t,f} &=& \FUN{x}{\APPLYX{\mathit{exists}}{t}{\CHILD{x}}\AOR\APPLY{f}{x}}\\
945 \FSEM{(c)}{t,f} &=& \FSEM{c}{t,f}\\
946 \FSEM{\{c:\alpha\}}{t,f} &=& \FSEM{c}{\FUN{x}{\PSEM{c}\AAND\APPLY{\alpha}{x}\AAND\APPLY{t}{x},f}}\\
947 \FSEM{c_1\;c_2}{t,f} &=& \FUN{x}{\APPLY{\FSEM{c_1}{\FSEM{c_2}{t,\FUN{\_}{\APPLY{f}{x}}},f}}{x}}\\
948 \FSEM{c_1\&c_2}{t,f} &=& \FUN{x}{\APPLY{\FSEM{c_1}{\FUN{y}{\APPLY{\FSEM{c_2}{\FUN{z}{(y=z)\AAND\APPLY{t}{z}},\FUN{\_}{\APPLY{f}{x}}}}{x}},f}}{x}}\\
949 \FSEM{c_1\mid c_2}{t,f} &=& \FSEM{c_1}{t,\FSEM{c_2}{t,f}}\\
950 \FSEM{c+}{t,f} &=& \FSEM{c}{\FSEM{c+}{t,t},f}\\
951 \FSEM{c?}{t,f} &=& \FSEM{c}{t,t}\\
952 \FSEM{c*}{t,f} &=& \FSEM{{c+}?}{t,f}\\[3ex]
953 \QSEM{c}{} &=& \PSEM{c}\\
954 \QSEM{!c}{} &=& \FUN{x}{\neg\APPLY{\PSEM{c}}{x}}\\
955 \QSEM{\langle*\rangle}{} &=& \FUN{\_}{\TRUE}\\
956 \QSEM{\langle n\rangle}{} &=& \FUN{x}{\NAME{x}=n}\\
957 \QSEM{@n}{} &=& \FUN{x}{\HASATTRIBUTE{x}{n}}\\
958 \QSEM{@n=v}{} &=& \FUN{x}{\ATTRIBUTE{x}{n}=v}\\
959 \QSEM{[p_1\#p_2]}{} &=& \FUN{x}{\APPLY{\LSEM{p_1}{}}{\PREV{x}}\wedge\APPLY{\RSEM{p_2}{}}{\NEXT{x}}}\\[3ex]
960 \LSEM{}{} &=& \FUN{\_}{\TRUE}\\
961 \LSEM{\cent}{} &=& \mathit{null}\\
962 \LSEM{p\;q}{} &=& \FUN{x}{\MATCH{x}{y}{\QSEM{q}{y}\AAND\APPLY{\LSEM{p}}{\PREV{y}}}{\FALSE}}\\
963 \RSEM{}{} &=& \FUN{\_}{\TRUE}\\
964 \RSEM{\$}{} &=& \mathit{null}\\
965 \RSEM{p\;q}{} &=& \FUN{x}{\MATCH{x}{y}{\QSEM{q}{y}\AAND\APPLY{\RSEM{p}}{\NEXT{y}}}{\FALSE}}\\
966 \mathit{null} &=& \FUN{x}{\MATCH{x}{\_}{\FALSE}{\TRUE}}\\
967 \mathit{exists} &=& \FUN{t}{\REC{a}{\FUN{x}{\MATCH{x}{y}{\APPLY{t}{y}\AOR\APPLY{a}{\NEXT{x}}}{\FALSE}}}}