helm/papers/use_case/use_case.tex

   1 \documentclass[11pt,epsf,a4wide]{article}
   2 \usepackage{latexsym, amssymb, amsfonts}
   3 \usepackage{graphicx}
   4
   5 \addtolength{\topmargin}{-1.7cm}
   6 \addtolength{\oddsidemargin}{-1.5cm}
   7 \addtolength{\evensidemargin}{-1.5cm}
   8 \addtolength{\textwidth}{3cm}
   9 \addtolength{\textheight}{3.4cm}
  10
  11 \def\mowgli{{\sc MoWGLI}}
  12
  13 \title{A complex use case for XML-technology:\\
  14 \small{The European Project IST-2001-33562 MoWGLI}}
  15 \date{}
  16 \author{}
  17
  18 \begin{document}
  19 \maketitle
  20
  21 \thispagestyle{empty}
  22
  23 \begin{abstract}
  24 The following paper contains the results of an extensive validation
  25 of XML-technology conducted during the last three years in the framework
  26 of the European Project IST-2001-33562 \mowgli.
  27 \end{abstract}
  28
  29 \section{Introduction}
  30 The European Project IST-2001-33562 \mowgli, activated in march 2002 and
  31 lasting three years, aimed at the exploitation of semantic based
  32 techniques for mathematical knowledge management\cite{mkm03,mkm04},
  33 with particular
  34 emphasis on {\em web publishing}, {\em transformation} and {\em
  35 searching and retrieving} issues. \mowgli was based on an
  36 extensive use of XML-technology, aiming both at providing a
  37 major XML test-bench and at becoming an example of best-practice in its
  38 use.
  39
  40 To grasp the dimension of the validation, we managed a
  41 repository of $tot$ fully structured xml-documents\footnote{Each document is
  42 piece of formal mathematics (a definition or a theorem), exported
  43 from the library of a well-known tool for the automatic support to formal
  44 reasoning: the Coq proof assistant} of very different sizes,
  45 spanning from very few elements to $tot$ gigabytes (for a total of
  46 $tot$ gigabytes). We wrote 13951 lines of XSLT (39 stylesheets)
  47 plus $13398$ additional lines ($21$ stylesheets) which are
  48 {\em automatically generated}
  49 starting from a very high-level
  50 xml description of mathematical notation\footnote{We also reused
  51 a stylesheet transforming MathML content to
  52 MathML presentation (4007 lines) developed at \dots.}. Each document
  53 in the repository has an associated RDF file mostly modeling, in addition
  54 to traditional Dublin-Core metadata, dependendy relations among the
  55 objects in the repository. This kind of metadata are automatically
  56 computed from the structured description of the object, and extensively
  57 used to improve both searching and browsing functionality (e.g. providing
  58 graphical descriptions of the dependency relations among the object in
  59 the repository).
  60
  61 A lot of different tools have been tested during the development of
  62 the project covering most aspects of XML-technology.
  63 Our validation effort provided valuable feedback to many developers
  64 of these tools,  both in the form of bug reports (abbiamo idea di quanti
  65 bug reports abbiamo redatto e della loro classificazionI? quanti cioe'
  66 gravi?) and suggestions for improvement (especially in performance).
  67
  68 We also actively contributed to software development. In particular,
  69 let us mention the implementation of Gdome2 \cite{Gdome2}, a level2
  70 compliant DOM api written in C++ for the gnome programming environment,
  71 that was mostly developed and tuned within the framework of the
  72 \mowgli project, and GtkMathView (\cite{}), a rendering widget for
  73 MathML.
  74
  75 Summing up, \mowgli has been a very intense project, explicitly aimed
  76 to {\em stress} XML-technology up to its very limit (and possibly a
  77 little beyond, as in the case of stylesheets). The following paper
  78 is just a report of our experience.
  79
  80 Cenni sulla struttura del paper? Vediamo alla fine.
  81
  82
  83 \section{The repository}
  84 The aim of \mowgli was to test the feasibility of passing from a machine
  85 readable to a machine understandable encoding of mathematical information,
  86 and to explore the the new potentialities offered by such
  87 encoding.
  88 To this aim, we needed large collections of documents enriched with
  89 semantic markup, and the natural solution was to use one of the many
  90 interesting libraries of formalized mathematics already existent in
  91 the world; in particular, we used the library of the Coq Proof assistant,
  92 developed at INRIA Future.
  93 This gave us a pretty large ($tot$) repository of {\em fully structured}
  94 mathematical documents: the actual dtd contains {\em no textual elements
  95 at all}. As a matter of fact, text is the most typical example of
  96 information which is machine-readable but not machine understandable.
  97 By banishing text, we intentionally adopted (here, as in many other
  98 aspects of our project) an {\extreme} position, not as a phisolophical
  99 choice or commitment, but as a mere working hypothesis.
 100
 101 The actual details of the DTD are not so relevant here. Let us just say
 102 that we had essentially two main classes of documents, characterized by
 103 the different nesting depth of the markup: {\b proof objects}
 104 (usually quite deep) and the corresponding {\b intermediate goals}
 105 (relatively flat, being collections of formulas).
 106
 107 Anticipare il perche' dell'enfasi sulla profondita', se (come
 108 pensiamo) il comportamento risutla essere sensibile a questo
 109 fattore. Se non lo e', cassiamo direttamente il discorso
 110 profondita' (o semplicemnte diciamo che non sembra essere rilevante).
 111
 112 Adesso mettiamo un po' di dati sulla libreria dimensione minima,
 113 media max dei files,
 114 idem per profondita' e larghezza.
 115
 116 Cenni sul DTD? Spiegare se/perche' non abbiamo mai sentito bisogno
 117 di una schema?
 118
 119
 120
 121 \begin{thebibliography}{}
 122 \bibitem{mkm03}A.Asperti, B.Buchberger, J.Davenport (eds).
 123 Proceedings of the Second International Conference on Mathematical
 124 Knowledge Management, MKM 2003. Bertinoro, Italy. LNCS, 2594.
 125 \bibitem{mkm04}A.Asperti, G.Bancerek, A.Trybulec (eds).
 126 Proceeding of the Third International Conference on
 127 Mathematical Knowledge Management, MKM 2004. Bialowieza, Poland. LNCS 3119.
 128 \bibitem{Gdome2} P. Casarini, L.Padovani. The Gnome DOM Engine.
 129 Markup Languages: Theory \& Practice, Vol. 3, Issue 2, pp. 173--190,
 130 ISSN 1099-6621, MIT Press, April 2002.
 131 \end{thebibliography}
 132
 133
 134 \end{document}