1 \documentclass[11pt,epsf,a4wide]{article}
2 \usepackage{latexsym, amssymb, amsfonts}
5 \addtolength{\topmargin}{-1.7cm}
6 \addtolength{\oddsidemargin}{-1.5cm}
7 \addtolength{\evensidemargin}{-1.5cm}
8 \addtolength{\textwidth}{3cm}
9 \addtolength{\textheight}{3.4cm}
11 \def\mowgli{{\sc MoWGLI}}
13 \title{A complex use case for XML-technology:\\
14 \small{The European Project IST-2001-33562 MoWGLI}}
24 The following paper contains the results of an extensive validation
25 of XML-technology conducted during the last three years in the framework
26 of the European Project IST-2001-33562 \mowgli.
29 \section{Introduction}
30 The European Project IST-2001-33562 \mowgli, activated in march 2002 and
31 lasting three years, aimed at the exploitation of semantic based
32 techniques for mathematical knowledge management\cite{mkm03,mkm04},
34 emphasis on {\em web publishing}, {\em transformation} and {\em
35 searching and retrieving} issues. \mowgli was based on an
36 extensive use of XML-technology, aiming both at providing a
37 major XML test-bench and at becoming an example of best-practice in its
40 To grasp the dimension of the validation, we managed a
41 repository of $tot$ fully structured xml-documents\footnote{Each document is
42 piece of formal mathematics (a definition or a theorem), exported
43 from the library of a well-known tool for the automatic support to formal
44 reasoning: the Coq proof assistant} of very different sizes,
45 spanning from very few elements to $tot$ gigabytes (for a total of
46 $tot$ gigabytes). We wrote 13951 lines of XSLT (39 stylesheets)
47 plus $13398$ additional lines ($21$ stylesheets) which are
48 {\em automatically generated}
49 starting from a very high-level
50 xml description of mathematical notation\footnote{We also reused
51 a stylesheet transforming MathML content to
52 MathML presentation (4007 lines) developed at \dots.}. Each document
53 in the repository has an associated RDF file mostly modeling, in addition
54 to traditional Dublin-Core metadata, dependendy relations among the
55 objects in the repository. This kind of metadata are automatically
56 computed from the structured description of the object, and extensively
57 used to improve both searching and browsing functionality (e.g. providing
58 graphical descriptions of the dependency relations among the object in
61 A lot of different tools have been tested during the development of
62 the project covering most aspects of XML-technology.
63 Our validation effort provided valuable feedback to many developers
64 of these tools, both in the form of bug reports (abbiamo idea di quanti
65 bug reports abbiamo redatto e della loro classificazionI? quanti cioe'
66 gravi?) and suggestions for improvement (especially in performance).
68 We also actively contributed to software development. In particular,
69 let us mention the implementation of Gdome2 \cite{Gdome2}, a level2
70 compliant DOM api written in C++ for the gnome programming environment,
71 that was mostly developed and tuned within the framework of the
72 \mowgli project, and GtkMathView (\cite{}), a rendering widget for
75 Summing up, \mowgli has been a very intense project, explicitly aimed
76 to {\em stress} XML-technology up to its very limit (and possibly a
77 little beyond, as in the case of stylesheets). The following paper
78 is just a report of our experience.
80 Cenni sulla struttura del paper? Vediamo alla fine.
83 \section{The repository}
84 The aim of \mowgli was to test the feasibility of passing from a machine
85 readable to a machine understandable encoding of mathematical information,
86 and to explore the the new potentialities offered by such
88 To this aim, we needed large collections of documents enriched with
89 semantic markup, and the natural solution was to use one of the many
90 interesting libraries of formalized mathematics already existent in
91 the world; in particular, we used the library of the Coq Proof assistant,
92 developed at INRIA Future.
93 This gave us a pretty large ($tot$) repository of {\em fully structured}
94 mathematical documents: the actual dtd contains {\em no textual elements
95 at all}. As a matter of fact, text is the most typical example of
96 information which is machine-readable but not machine understandable.
97 By banishing text, we intentionally adopted (here, as in many other
98 aspects of our project) an {\extreme} position, not as a phisolophical
99 choice or commitment, but as a mere working hypothesis.
101 The actual details of the DTD are not so relevant here. Let us just say
102 that we had essentially two main classes of documents, characterized by
103 the different nesting depth of the markup: {\b proof objects}
104 (usually quite deep) and the corresponding {\b intermediate goals}
105 (relatively flat, being collections of formulas).
107 Anticipare il perche' dell'enfasi sulla profondita', se (come
108 pensiamo) il comportamento risutla essere sensibile a questo
109 fattore. Se non lo e', cassiamo direttamente il discorso
110 profondita' (o semplicemnte diciamo che non sembra essere rilevante).
112 Adesso mettiamo un po' di dati sulla libreria dimensione minima,
114 idem per profondita' e larghezza.
116 Cenni sul DTD? Spiegare se/perche' non abbiamo mai sentito bisogno
121 \begin{thebibliography}{}
122 \bibitem{mkm03}A.Asperti, B.Buchberger, J.Davenport (eds).
123 Proceedings of the Second International Conference on Mathematical
124 Knowledge Management, MKM 2003. Bertinoro, Italy. LNCS, 2594.
125 \bibitem{mkm04}A.Asperti, G.Bancerek, A.Trybulec (eds).
126 Proceeding of the Third International Conference on
127 Mathematical Knowledge Management, MKM 2004. Bialowieza, Poland. LNCS 3119.
128 \bibitem{Gdome2} P. Casarini, L.Padovani. The Gnome DOM Engine.
129 Markup Languages: Theory \& Practice, Vol. 3, Issue 2, pp. 173--190,
130 ISSN 1099-6621, MIT Press, April 2002.
131 \end{thebibliography}