From: Ferruccio Guidi Date: Fri, 12 Mar 2004 23:02:32 +0000 (+0000) Subject: updating X-Git-Tag: v0_0_4~18 X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=commitdiff_plain;h=1c84c6e1df257ad284a256ee0c2c1a203f81b713;p=helm.git updating --- diff --git a/helm/mathql/doc/mathql.tex b/helm/mathql/doc/mathql.tex index 30646a98f..522e405e1 100644 --- a/helm/mathql/doc/mathql.tex +++ b/helm/mathql/doc/mathql.tex @@ -7,135 +7,35 @@ \addtolength{\textwidth}{2.0cm} \addtolength{\topmargin}{-1.0cm} -\newcommand{\MathQL}{\textsc{mathql-1}} -\newcommand{\RDF}{\textsc{rdf}} -\newcommand{\RDFS}{\textsc{rdf schema}} -\newcommand{\HELM}{\textsc{helm}} -\newcommand{\POSIX}{\textsc{posix}} -\newcommand{\XML}{\textsc{xml}} -\newcommand{\CAML}{\textsc{caml}} -\newcommand{\SQL}{\textsc{sql}} -\newcommand{\PostgreSQL}{\textsc{postgresql}} -\newcommand{\Galax}{\textsc{galax}} -\newcommand{\XQuery}{\textsc{xquery}} +\newcommand\xcomment[1]{} +\newcommand\URI[1]{\texttt{<#1>}} + +\newcommand\MathQL{\textsc{mathql-1}} +\newcommand\RDF{\textsc{rdf}} +\newcommand\RDFS{\textsc{rdf schema}} +\newcommand\HELM{\textsc{helm}} +\newcommand\POSIX{\textsc{posix}} +\newcommand\XML{\textsc{xml}} +\newcommand\CAML{\textsc{caml}} +\newcommand\SQL{\textsc{sql}} +\newcommand\MySQL{\textsc{mysql}} +\newcommand\PostgreSQL{\textsc{postgresql}} +\newcommand\Galax{\textsc{galax}} +\newcommand\XQuery{\textsc{xquery}} \title{MathQL-1.4} \author{Ferruccio Guidi} +\bibliographystyle{numeric} + \begin{document} \maketitle -\section{Overview} - -{\MathQL} is a query language for {\RDF} databases, developed in the context -of the {\HELM} project. Its name suggests that it is supposed to be the first -of a group of query languages for retrieving information from distributed -digital libraries of formal mathematical knowledge, but no other languages of -this group have been implemented yet except for {\MathQL} that is not -Mathematics-oriented. So the name is a bit misleading. - -{\MathQL} is carefully designed for having the following features: - -\begin{enumerate} - -\item -compliance with the main requirements stated by the {\RDF} community; - -\item -native support for post-processing the query results; - -\item -{\HELM}-independent implementation of the query engine. - -\end{enumerate} - -We will briefly analyze these features in the remaining part of this section. - -\subsubsection*{The main requirements from the RDF community} - -As a query language for {\RDF} databases, {\MathQL} has a well-conceived -semantics, defined in term of an abstract metadata model, according to which -queries return exhaustive solutions. -The language provides facilities for imposing query constraints based on -{\RDFS} and for the traversal of compound values of properties. -It also provides a full set of Boolean operators to compose the query -constraints and facilities for selecting resources or literals by means of -{\POSIX} regular expressions. -Moreover the language allows to customize the query results specifying what -part of a solution should be preserved, and supports a machine-processable -{\XML} syntax as well as a human-readable textual syntax to achieve the best -usability. - -The two syntaxes concern both queries and results, making {\MathQL} usable in -a distributed environment where query engines are implemented as stand-alone -components. This is because in this setting both queries and query results -must be exchanged by the system's components and thus need to be encoded in -clearly defined format. - -{\MathQL} provides a graph-oriented access to the {\RDF} metadata, based on -tree instantiation. -This approach has the advantage of providing an abstraction over the -concrete representation of the {\RDF} database (that can consist of {\RDF} -triples and {\XML} files simultaneously) at the user level, and this is -definitely desirable especially in a distributed context. - -{\MathQL} query results are meant to capture the structure of trees coming -from an {\RDF} graph and for this purpose a standard $1$- or $2$-dimensional -organization (as provided by most {\RDF}-oriented query languages) is not -satisfactory. Here {\MathQL} approach is to use a $4$-dimensional organization -for its query results. - -\subsubsection*{Post-processing and code generation capabilities} - -The {\MathQL} query engine, that is written in {\CAML} for an easy integration -with the {\HELM} software, provides two ways of processing the query results: -at {\CAML} side and natively. - -At {\CAML} side, an application issues a query calling a function of the -engine and manipulates the result either operating directly on its internal -representation (that is placed in the public scope), or using a set of -dedicated functions specifically designed to manage the query results. -This set of functions includes a basic library but is extensible depending -on the {\CAML} modules included in the engine at compile-time. In this way -an expert user can write a {\CAML} module with new dedicated functions and can -include it in the engine recompiling it. - -{\MathQL} supports native post-processing of the query results including the -standard constructions of an imperative Turing-complete programming language, -whose aim is definitely not that of being all-purpose (the user can work at -{\CAML} side for that), but of being optimized for the management of the -query results. -In this context an {\SQL}-like ``select-from-where'' construction is provided -(as required by the {\RDF} community) as well as a mechanism for accessing the -post-processing dedicated functions available to the engine. - -Moreover the language provides access to an extensible set of code-generating -functions (also available at {\CAML} side) that the expert user can define -writing suitable {\CAML} modules for the engine. -Note that the generated code is always {\MathQL} code. - -The code generation features allow to build complex queries incrementally and -in an automatic manner, as required by the needs of the {\HELM} project. -Using the native programming language, instead, queries can include the -post-processing algorithms on their results so the querying code and the -subsequent processing code (if needed) are treated together as a -self-contained object that can be computed by a single engine. -In this sense the alternative of performing a complex query on a remote -component issuing some {\MathQL} querying code followed by some {\CAML} -post-processing code is really infeasible in a distributed context. - -\subsubsection*{Physical organization of the RDF database} - -The implementation of the {\MathQL} query engine does not depend on any -software developed within the {\HELM} project, nor it depends on the {\HELM} -metadata model in any way. +\tableofcontents -However the engine does make few assumptions on the way metadata are -physically organized and needs some user-provided knowledge about the concrete -metadata representation. -Metadata stored as {\RDF} triples are accessed through a {\PostgreSQL} engine, -while metadata stored as {\RDF}/{\XML} files are accessed through a {\Galax} -{\XQuery} engine. +\input{mathql_overview} +\input{mathql_introduction} +\input{mathql_bib} \end{document} diff --git a/helm/mathql/doc/mathql_bib.tex b/helm/mathql/doc/mathql_bib.tex new file mode 100644 index 000000000..374048459 --- /dev/null +++ b/helm/mathql/doc/mathql_bib.tex @@ -0,0 +1,90 @@ +\begin{thebibliography}{99} + +\bibitem {APSCGS03} +A. Asperti, L. Padovani, C. Sacerdoti Coen, F.Guidi, I. Schena: +\emph{Mathematical Knowledge Management in HELM}. +In Annals of Mathematics and Artificial Intelligence 38(1), +Kluwer Academic Publishers (May 2003), pp. 27-46. + +\bibitem {Gui03} +F. Guidi: +\emph{Searching and Retrieving in Content-based Repositories of Formal +Mathematical Knowledge}. +Ph.D. Thesis in Computer Science, University of Bologna, March 2003. Technical +report UBLCS 2003-06. + +\bibitem {GSC03} +F. Guidi, C. Sacerdoti Coen: +\emph{Querying Distributed Digital Libraries of Mathematics}. +In Proc. of the 11th Symposium on the Integration of Symbolic Computation and +Mechanized Reasoning (Calculemus 2003). Rome, Italy, +September 2003. pages 17-30, Aracne. 2003. + +\bibitem {GS03} +F. Guidi, I. Schena: +\emph{A Query Language for a Metadata Framework about Mathematical Resources}. +In Proc. of 2nd International Conference on Mathematical Knowledge +Management (MKM 2003). +Bertinoro, Italy, February 2003. +LNCS 2594, pp. 105-118, Springer. 2003. + +\bibitem {Lan98} +C. Laneve: +\emph{La descrizione operazionale dei linguaggi di programmazione. \\ +Un'introduzione} +FrancoAngeli, 1998. + +\bibitem {Lor02} +D. Lordi: +\emph{Sperimentazione e Sviluppo di Strumenti per la gestione di metadati}. \\ +Master Thesis in Computer Science, University of Bologna, 2002. +Advisor: A. Asperti. + +\bibitem {RDF} +\emph{Resource Description Framework (RDF) Model and Syntax Specification}. +W3C Recommendation. February 22, 1999. \\ +\URI{http://www.w3.org/TR/1999/REC-rdfsyntax-19990222/}. + +\bibitem {RDFS} +\emph{RDF Vocabulary Description Language 1.0: RDF Schema}. +W3C Working Draft. January 23, 2003 +\URI{http://www.w3.org/TR/rdf-schema/}. + +\bibitem {Sam00} +G. Sambin: \emph{Formal topology and domains}. +Electronic Notes in Theoretical Computer Science, (35). 2000. + +\bibitem {Sch02} +I. Schena: +\emph{Towards a Semantic Web for Formal Mathematics}. +Ph.D. dissertation. University of Bologna, 2002. Advisor: A. Asperti. + +\bibitem {Unicode} +Unicode Consortium: +\emph{The Unicode Standard, Version 3.2}. March 2002. \\ +\URI{http://www.unicode.org/unicode/standard/standard.html}. + +\bibitem {URI} +\emph{Uniform Resource Identifiers (URI): Generic Syntax (RFC 2396)}. +August 1998. \\ +\URI{http://www.ietf.org/rfc/rfc2396.txt}. + +\bibitem {W3Ca} +\emph{Character Model for the World Wide Web 1.0}, +W3C Working Draft. April 30, 2002. +\URI{http://www.w3.org/TR/charmod/}. + +\bibitem {Win93} +G. Winskel: +\emph{The formal semantics of programming languages: an introduction}. +MIT Press Series in the Foundations of Computing. London: MIT Press, 1993. + +\bibitem {XML} +\emph{Extensible Markup Language (XML) 1.0 (Second Edition)}. +W3C Recommendation. October 6, 2000. \URI{http://www.w3.org/REC-xml}. + +\bibitem {XQuery} +\emph{XQuery 1.0: An XML Query Language}. +W3C Working Draft November 15, 2002. \URI{http://www.w3.org/TR/xquery/}. + +\end{thebibliography} diff --git a/helm/mathql/doc/mathql_introduction.tex b/helm/mathql/doc/mathql_introduction.tex new file mode 100644 index 000000000..d243ca901 --- /dev/null +++ b/helm/mathql/doc/mathql_introduction.tex @@ -0,0 +1,6 @@ +\section{Introduction} + +This paper presents {\MathQL} version 4 which is the latest version of the +language, fully developed by Ferruccio Guidi. +For a description of the previous versions of {\MathQL} see: \cite{Gui03} +(version 3), \cite{GS03} (version 2), \cite{Lor02} (version 1). diff --git a/helm/mathql/doc/mathql_overview.tex b/helm/mathql/doc/mathql_overview.tex new file mode 100644 index 000000000..d511c7aa1 --- /dev/null +++ b/helm/mathql/doc/mathql_overview.tex @@ -0,0 +1,156 @@ +\section{Overview} + +{\MathQL}% +\footnote{See \URI{http://helm.cs.unibo.it/mathql}.} +is a query language for {\RDF} \cite{RDF,RDFS} databases, developed in the +context of the {\HELM}% +\footnote{See \URI{http://helm.cs.unibo.it}.} +project \cite{APSCGS03}. +Its name suggests that it is supposed to be the first of a group of query +languages for retrieving information from distributed digital libraries of +formal mathematical knowledge, but no other languages of this proposal have +been implemented yet except for {\MathQL} that is not Mathematics-oriented. +So the name is a bit misleading. + +\xcomment { + +The MathQL proposal rises within the HELM project with the final aim of +providing a set of query languages for digital libraries of formalized +mathematical resources, capable of expressing content-aware requests. + +This proposal has several domains of application and may be useful for +database or on-line libraries reviewers, for proof assistants or +proof-checking systems, and also for learning environments because these +applications require features for classifying, searching and browsing +mathematical information in a semantically meaningful way. + +As the most natural way to handle content information about a resource is +by means of metadata, our first task is providing a query language that we +call MathQL level 1 (or {\MathQL} for short), suitable for a metadata +framework. +Other languages to be defined in the context of the MathQL proposal may be +suitable for queries about the semantic structure of mathematical data: +this includes content-based pattern-matching (MathQL-2) and possibly other +forms of formal matching involving for instance isomorphism, unification and +$\delta$-expansion% +\footnote{by $\delta$-expansion we mean the expansion of definitions.} +(MathQL-3). + +In this perspective the role of a query on metadata can be that of producing a +filtered knowledge base containing relevant information for subsequent queries +of other kind. + +} + +{\MathQL} is carefully designed for making up for two limitations that seem to +characterize several implementations and proposals of current {\RDF}-oriented +query languages, namely the insufficient compliance with the most requested +features and the poor attention paid to query result management. +Thus the language has the following design goals: + +\begin{enumerate} + +\item +compliance with the main requirements stated by the {\RDF} community; + +\item +native support for post-processing the query results; + +\item +{\HELM}-independent implementation of the query engine. + +\end{enumerate} + +We will briefly analyze these features in the remaining part of this +section. + +\subsubsection*{The main requirements from the RDF community} + +As a query language for {\RDF} databases, {\MathQL} has a well-conceived +semantics, defined in term of an abstract metadata model, according to which +queries return exhaustive solutions. +The language provides facilities for imposing query constraints based on +{\RDFS} \cite{RDFS} and for the traversal of compound values of properties. +It also provides a full set of Boolean operators to compose the query +constraints and facilities for selecting resources or literals by means of +{\POSIX} regular expressions. +Moreover the language allows to customize the query results specifying what +part of a solution should be preserved, and supports a machine-processable +{\XML} \cite{XML} syntax as well as a human-readable textual syntax to achieve +the best usability. +The two syntaxes concern both queries and results, making {\MathQL} usable in +a distributed environment where query engines are implemented as stand-alone +components. This is because in this setting both queries and query results +must be exchanged by the system's components and thus need to be encoded in +clearly defined format. + +{\MathQL} provides a graph-oriented access to the {\RDF} metadata, based on +tree instantiation. +This approach has the advantage of providing an abstraction over the +concrete representation of the {\RDF} database (that can consist of {\RDF} +triples and {\XML} files simultaneously) at the user level, and this is +definitely desirable especially in a distributed context. + +{\MathQL} query results are meant to capture the structure of trees coming +from an {\RDF} graph and for this purpose a standard $1$- or $2$-dimensional +organization (as provided by most {\RDF}-oriented query languages) is not +satisfactory. Here {\MathQL} approach is to use a $4$-dimensional organization +for its query results. + +\subsubsection*{Post-processing and code generation capabilities} + +The {\MathQL} query engine, that is written in {\CAML}% +\footnote{See \URI{http://caml.inria.fr}.} +for an easy integration with the {\HELM} software, provides two ways of +processing the query results: at {\CAML} side and natively. + +At {\CAML} side, an application issues a query calling a function of the +engine and manipulates the result either operating directly on its internal +representation (through a low-level interface), or using a set of dedicated +functions specifically designed to manage the query results. +This set of functions includes a basic library but is extensible depending +on the {\CAML} modules included in the engine at compile-time. In this way +an expert user can write a {\CAML} module with new dedicated functions and can +include it in the engine recompiling it. + +{\MathQL} supports native post-processing of the query results including the +standard constructions of an imperative Turing-complete programming language, +whose aim is definitely not that of being all-purpose (the user can work at +{\CAML} side for that), but of being optimized for the management of the +query results. +In this context an {\SQL}-like ``select-from-where'' construction is provided +(as required by the {\RDF} community) as well as a mechanism for accessing the +post-processing dedicated functions available to the engine. + +Moreover the language provides access to an extensible set of code-generating +functions (also available at {\CAML} side) that the expert user can define +writing suitable {\CAML} modules for the engine. +Note that the generated code is always {\MathQL} code. + +The code generation features allow to build complex queries incrementally and +in an automatic manner, as required by the needs of the {\HELM} project. +Using the native programming language, instead, queries can include the +post-processing algorithms on their results so the querying code and the +subsequent processing code (if needed) are treated together as a +self-contained object that can be computed by a single engine. +In this sense the alternative of performing a complex query on a remote +component issuing some {\MathQL} querying code followed by some {\CAML} +post-processing code is really infeasible in a distributed context. + +\subsubsection*{Physical organization of the RDF database} + +The implementation of the {\MathQL} query engine does not depend on any +software developed within the {\HELM} project, nor it depends on the {\HELM} +metadata model in any way. + +However the engine does make few assumptions on the way metadata are +physically organized and needs some user-provided knowledge about the concrete +metadata representation. +Metadata stored as {\RDF} triples are accessed through a {\MySQL}% +\footnote{See \URI{http://www.mysql.com}.} +or a {\PostgreSQL}% +\footnote{See \URI{http://www.postgresql.org}.} +engine, while metadata stored as {\RDF}/{\XML} files are accessed through a +{\Galax}% +\footnote{See \URI{http://db.bell-labs.com/galax/}.} +{\XQuery} \cite{XQuery} engine.