\usepackage{amssymb}
\usepackage{stmaryrd}
-\title{A MathML Editor Based on \TeX{} Syntax\\Formal Specification}
+\title{\EdiTeX: a MathML Editor Based on \TeX{} Syntax\\\small Description and Formal Specification}
\author{Paolo Marinelli\\Luca Padovani\\\small\{{\tt pmarinel},{\tt lpadovan}\}{\tt @cs.unibo.it}\\\small Department of Computer Science\\\small University of Bologna}
\date{}
+\newcommand{\EdiTeX}{Edi\TeX}
+
\newcommand{\tmap}[1]{\llbracket#1\rrbracket}
\newcommand{\tadvance}{\vartriangle}
\newcommand{\tnext}{\rhd}
\maketitle
+\section{Introduction}
+
+MathML~\cite{MathML1,MathML2,MathML2E} is an XML application for the
+representation of mathematical expressions. As most XML applications,
+MathML is unsuitable to be hand-written, except for the simplest
+cases, because of its verbosity. In fact, the MathML specification
+explicitly states that
+\begin{quote}
+``While MathML is human-readable, it is anticipated that, in all but
+the simplest cases, authors will use equation editors, conversion
+programs, and other specialized software tools to generate MathML''
+\end{quote}
+
+The statement about human readability of MathML is already too strong,
+as the large number of mathematical symbols, operators, and
+diacritical marks that are used in mathematical notation cause MathML
+documents to make extensive use of Unicode characters that typically
+are not in the ``visible'' range of common text editors. Such
+characters may appear as entity references, whose name indicates
+somehow the kind of symbol used, or character references or they are
+directly encoded in the document encoding scheme (for instance,
+UTF-8).
+
+It is thus obvious that authoring MathML documents assumes the
+assistance of dedicated tools. As of today, such tools can be
+classified into two main categories:
+\begin{enumerate}
+ \item WYSIWYG (What You See Is What You Get) editors that allow the
+ author to see the formatted document on the screen as it is
+ composed;
+ \item conversion tools that generate MathML markup from different
+ sources, typically other markup languages for scientific
+ documents, such as \TeX.
+\end{enumerate}
+
+While the former tools are certainly more appealing, especially to the
+unexperienced user, as they give a direct visual feedback, the
+existance of tools in the second category takes into account the large
+availability of existing documents in \TeX{} format, and also the fact
+that experienced or ``lazy'' users may continue to prefer the use of a
+markup language other than MathML for editing, and generate MathML
+only as a final step of the authoring process. The ``laziness'' is not
+really intended as a way of being reluctant towards a new technology,
+but rather as a justified convincement that WYSIWYG editors are ``nice
+to look at'' but after all they may slow down the authoring process.
+WYSIWYG editors often involve the use of menus, palettes of symbols,
+and, in general, an extensive use of the pointing device (the mouse)
+for completing most operations. The use of shortcuts is of little
+help, as it implies very soon a challenging exercise for the fingers
+and the mind. Moreover, authors \emph{cannot improve} their authoring
+speed with time. On the other side, the gap between the syntax of any
+markup language for mathematics and mathematical notation may be
+relevant, especially for large, non-trivial formulas and authoring is
+a re-iterated process in which the author repeadtedly types the markup
+in the editor, compiles, and looks at the result inside a pre-viewer.
+
+\EdiTeX{} tries to synthesize the ``best of both worlds'' in a single
+tool. The basic idea is that of creating a WYSIWYG editor in which
+editing is achieved by typing \TeX{} markup as the author would do in
+a text editor. The \TeX{} markup is tokenized and parsed on-the-fly
+and a corresponding MathML representation is created and
+displayed. This way, the author can see the rendered document as it
+changes. The advantages of this approach can be summarized as follows:
+\begin{itemize}
+ \item the document is rendered concurrently with the editing, the
+ user has an immediate feedback hence it is easier to spot errors;
+ \item the author types in a concrete (and likely familiar) syntax
+ improving the editing speed;
+ \item the usual WYSIWYG mechanisms are still available. In
+ particular, it is possible to select \emph{visually} a fragment of
+ the document that needs re-editing, or that was left behind for
+ subsequent editing.
+\end{itemize}
+
+\paragraph{The Name of the Game:} there is no reference to MathML in
+the name ``\EdiTeX.'' In fact, the architecture of the editor is not
+tied to MathML markup. Although we focus on MathML editing, by
+changing a completely modularized component of the editor it is
+virtually possible to generate any other markup language.
+
+\paragraph{Acknowledgments.} Stephen M. Watt and Igor Rodionov for
+their work on the \TeX{} to MathML conversion tool; Stan Devitt for an
+illuminating discussion about the architecture of \TeX{} to XML
+conversion tools; Claudio Sacerdoti Coen for the valuable feedback and
+uncountable bug reports.
+
+\section{Architecture}
+
+\section{Customization}
+
+\subsection{Short and Long Identifiers}
+
+\subsection{The Dictionary}
+
+\subsection{Stylesheets and Trasformations}
+
+\subsection{Rendering}
+
+\section{XML Representation of \TeX{} Markup}
+
\section{Tokens}
The following tokens are defined: