X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=blobdiff_plain;f=helm%2Fmathql%2Fdoc%2Fmathql_overview.tex;h=45fd2cb99f366fe50ea2a38ada18660058cd6219;hb=4167cea65ca58897d1a3dbb81ff95de5074700cc;hp=d511c7aa1f3e405310cc201c23d1f321254485d1;hpb=1c84c6e1df257ad284a256ee0c2c1a203f81b713;p=helm.git diff --git a/helm/mathql/doc/mathql_overview.tex b/helm/mathql/doc/mathql_overview.tex index d511c7aa1..45fd2cb99 100644 --- a/helm/mathql/doc/mathql_overview.tex +++ b/helm/mathql/doc/mathql_overview.tex @@ -1,46 +1,31 @@ \section{Overview} {\MathQL}% -\footnote{See \URI{http://helm.cs.unibo.it/mathql}.} +\footnote{See \CURI{http://helm.cs.unibo.it/mathql}.} is a query language for {\RDF} \cite{RDF,RDFS} databases, developed in the context of the {\HELM}% -\footnote{See \URI{http://helm.cs.unibo.it}.} +\footnote{See \CURI{http://helm.cs.unibo.it}.} project \cite{APSCGS03}. Its name suggests that it is supposed to be the first of a group of query languages for retrieving information from distributed digital libraries of -formal mathematical knowledge, but no other languages of this proposal have -been implemented yet except for {\MathQL} that is not Mathematics-oriented. -So the name is a bit misleading. - -\xcomment { - -The MathQL proposal rises within the HELM project with the final aim of -providing a set of query languages for digital libraries of formalized -mathematical resources, capable of expressing content-aware requests. - +formal mathematical knowledge by means of content-aware requests, but no other +languages of this proposal have been implemented yet except for {\MathQL} that +is not Mathematics-oriented. So the name is a bit misleading. This proposal has several domains of application and may be useful for database or on-line libraries reviewers, for proof assistants or proof-checking systems, and also for learning environments because these applications require features for classifying, searching and browsing mathematical information in a semantically meaningful way. - -As the most natural way to handle content information about a resource is -by means of metadata, our first task is providing a query language that we -call MathQL level 1 (or {\MathQL} for short), suitable for a metadata -framework. Other languages to be defined in the context of the MathQL proposal may be suitable for queries about the semantic structure of mathematical data: -this includes content-based pattern-matching (MathQL-2) and possibly other -forms of formal matching involving for instance isomorphism, unification and +this includes content-based pattern-matching and possibly other forms of +formal matching involving for instance isomorphism, unification and $\delta$-expansion% -\footnote{by $\delta$-expansion we mean the expansion of definitions.} -(MathQL-3). - -In this perspective the role of a query on metadata can be that of producing a +\footnote{By $\delta$-expansion we mean the expansion of definitions.}. +In this perspective the role of a query on metadata is that of producing a filtered knowledge base containing relevant information for subsequent queries -of other kind. - -} +of other kind (see \cite{GSC03} for a more detailed description of this +approach). {\MathQL} is carefully designed for making up for two limitations that seem to characterize several implementations and proposals of current {\RDF}-oriented @@ -64,6 +49,8 @@ native support for post-processing the query results; We will briefly analyze these features in the remaining part of this section. +\vspace{-1pc} + \subsubsection*{The main requirements from the RDF community} As a query language for {\RDF} databases, {\MathQL} has a well-conceived @@ -80,9 +67,8 @@ part of a solution should be preserved, and supports a machine-processable the best usability. The two syntaxes concern both queries and results, making {\MathQL} usable in a distributed environment where query engines are implemented as stand-alone -components. This is because in this setting both queries and query results -must be exchanged by the system's components and thus need to be encoded in -clearly defined format. +components. In this setting in fact both the queries and their results must be +exchanged by the system's components and thus need to be clearly encoded. {\MathQL} provides a graph-oriented access to the {\RDF} metadata, based on tree instantiation. @@ -94,13 +80,15 @@ definitely desirable especially in a distributed context. {\MathQL} query results are meant to capture the structure of trees coming from an {\RDF} graph and for this purpose a standard $1$- or $2$-dimensional organization (as provided by most {\RDF}-oriented query languages) is not -satisfactory. Here {\MathQL} approach is to use a $4$-dimensional organization +satisfactory. {\MathQL} approach is to use a $4$-dimensional organization for its query results. +\vspace{-1pc} + \subsubsection*{Post-processing and code generation capabilities} The {\MathQL} query engine, that is written in {\CAML}% -\footnote{See \URI{http://caml.inria.fr}.} +\footnote{See \CURI{http://caml.inria.fr}.} for an easy integration with the {\HELM} software, provides two ways of processing the query results: at {\CAML} side and natively. @@ -126,7 +114,6 @@ Moreover the language provides access to an extensible set of code-generating functions (also available at {\CAML} side) that the expert user can define writing suitable {\CAML} modules for the engine. Note that the generated code is always {\MathQL} code. - The code generation features allow to build complex queries incrementally and in an automatic manner, as required by the needs of the {\HELM} project. Using the native programming language, instead, queries can include the @@ -137,6 +124,8 @@ In this sense the alternative of performing a complex query on a remote component issuing some {\MathQL} querying code followed by some {\CAML} post-processing code is really infeasible in a distributed context. +\vspace{-1pc} + \subsubsection*{Physical organization of the RDF database} The implementation of the {\MathQL} query engine does not depend on any @@ -147,10 +136,10 @@ However the engine does make few assumptions on the way metadata are physically organized and needs some user-provided knowledge about the concrete metadata representation. Metadata stored as {\RDF} triples are accessed through a {\MySQL}% -\footnote{See \URI{http://www.mysql.com}.} +\footnote{See \CURI{http://www.mysql.com}.} or a {\PostgreSQL}% -\footnote{See \URI{http://www.postgresql.org}.} +\footnote{See \CURI{http://www.postgresql.org}.} engine, while metadata stored as {\RDF}/{\XML} files are accessed through a {\Galax}% -\footnote{See \URI{http://db.bell-labs.com/galax/}.} +\footnote{See \CURI{http://db.bell-labs.com/galax/}.} {\XQuery} \cite{XQuery} engine.