X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=blobdiff_plain;ds=sidebyside;f=helm%2Fmathql%2Fdoc%2Fmathql_overview.tex;fp=helm%2Fmathql%2Fdoc%2Fmathql_overview.tex;h=0000000000000000000000000000000000000000;hb=1696761e4b8576e8ed81caa905fd108717019226;hp=45fd2cb99f366fe50ea2a38ada18660058cd6219;hpb=5325734bc2e4927ed7ec146e35a6f0f2b49f50c1;p=helm.git
diff --git a/helm/mathql/doc/mathql_overview.tex b/helm/mathql/doc/mathql_overview.tex
deleted file mode 100644
index 45fd2cb99..000000000
--- a/helm/mathql/doc/mathql_overview.tex
+++ /dev/null
@@ -1,145 +0,0 @@
-\section{Overview}
-
-{\MathQL}%
-\footnote{See \CURI{http://helm.cs.unibo.it/mathql}.}
-is a query language for {\RDF} \cite{RDF,RDFS} databases, developed in the
-context of the {\HELM}%
-\footnote{See \CURI{http://helm.cs.unibo.it}.}
-project \cite{APSCGS03}.
-Its name suggests that it is the first of a group of query languages for
-retrieving information from distributed digital libraries of formal
-mathematical knowledge by means of content-aware requests. However,
-{\MathQL}, the only language of this proposal implemented so far, is not
-Mathematics-oriented, so the name is somewhat misleading.
-This proposal has several domains of application and may be useful for
-reviewers of databases or on-line libraries, for proof assistants or
-proof-checking systems, and also for learning environments, because these
-applications require features for classifying, searching and browsing
-mathematical information in a semantically meaningful way.
-Other languages to be defined in the context of the {\MathQL} proposal may be
-suitable for queries about the semantic structure of mathematical data:
-this includes content-based pattern-matching and possibly other forms of
-formal matching involving, for instance, isomorphism, unification and
-$\delta$-expansion%
-\footnote{By $\delta$-expansion we mean the expansion of definitions.}.
-In this perspective the role of a query on metadata is to produce a filtered
-knowledge base containing relevant information for subsequent queries of
-other kinds (see \cite{GSC03} for a more detailed description of this
-approach).
-
-{\MathQL} is carefully designed to make up for two limitations that seem to
-characterize several implementations and proposals of current {\RDF}-oriented
-query languages, namely insufficient compliance with the most requested
-features and poor attention paid to query result management.
-Thus the language has the following design goals:
-
-\begin{enumerate}
-
-\item
-compliance with the main requirements stated by the {\RDF} community;
-
-\item
-native support for post-processing the query results;
-
-\item
-{\HELM}-independent implementation of the query engine.
-
-\end{enumerate}
-
-We briefly analyze these features in the rest of this section.
-
-\vspace{-1pc}
-
-\subsubsection*{The main requirements from the RDF community}
-
-As a query language for {\RDF} databases, {\MathQL} has a well-conceived
-semantics, defined in terms of an abstract metadata model, according to which
-queries return exhaustive solutions.
-The language provides facilities for imposing query constraints based on
-{\RDFS} \cite{RDFS} and for the traversal of compound values of properties.
-It also provides a full set of Boolean operators to compose the query
-constraints and facilities for selecting resources or literals by means of
-{\POSIX} regular expressions.
-Moreover the language allows customizing the query results by specifying what
-part of a solution should be preserved, and supports a machine-processable
-{\XML} \cite{XML} syntax as well as a human-readable textual syntax to achieve
-the best usability.
-The two syntaxes concern both queries and results, making {\MathQL} usable in
-a distributed environment where query engines are implemented as stand-alone
-components.
-In this setting both the queries and their results must be exchanged among
-the system's components and thus need to be clearly encoded.
-
-{\MathQL} provides graph-oriented access to the {\RDF} metadata, based on
-tree instantiation.
-This approach has the advantage of providing an abstraction over the
-concrete representation of the {\RDF} database (which can consist of {\RDF}
-triples and {\XML} files simultaneously) at the user level, and this is
-desirable especially in a distributed context.
-
-{\MathQL} query results are meant to capture the structure of trees coming
-from an {\RDF} graph, and for this purpose a standard $1$- or $2$-dimensional
-organization (as provided by most {\RDF}-oriented query languages) is not
-satisfactory. The {\MathQL} approach is to use a $4$-dimensional organization
-for its query results.
-
-\vspace{-1pc}
-
-\subsubsection*{Post-processing and code generation capabilities}
-
-The {\MathQL} query engine, which is written in {\CAML}%
-\footnote{See \CURI{http://caml.inria.fr}.}
-for easy integration with the {\HELM} software, provides two ways of
-processing the query results: on the {\CAML} side and natively.
-
-On the {\CAML} side, an application issues a query by calling a function of the
-engine and manipulates the result either by operating directly on its internal
-representation (through a low-level interface) or by using a set of dedicated
-functions specifically designed to manage the query results.
-This set of functions includes a basic library and can be extended through
-the {\CAML} modules included in the engine at compile time. In this way
-an expert user can write a {\CAML} module with new dedicated functions and
-include it in the engine by recompiling it.
-
-{\MathQL} supports native post-processing of the query results through the
-standard constructions of an imperative Turing-complete programming language,
-whose aim is not to be all-purpose (the user can work on the {\CAML} side
-for that), but to be optimized for the management of the
-query results.
-In this context an {\SQL}-like ``select-from-where'' construction is provided
-(as required by the {\RDF} community) as well as a mechanism for accessing the
-dedicated post-processing functions available to the engine.
-
-Moreover the language provides access to an extensible set of code-generating
-functions (also available on the {\CAML} side) that the expert user can define
-by writing suitable {\CAML} modules for the engine.
-Note that the generated code is always {\MathQL} code.
-The code generation features allow complex queries to be built incrementally
-and automatically, as required by the needs of the {\HELM} project.
-Using the native programming language, instead, queries can include the
-post-processing algorithms on their results, so the querying code and the
-subsequent processing code (if needed) are treated together as a
-self-contained object that can be computed by a single engine.
-In this sense the alternative of performing a complex query on a remote
-component by issuing some {\MathQL} querying code followed by some {\CAML}
-post-processing code is really infeasible in a distributed context.
-
-\vspace{-1pc}
-
-\subsubsection*{Physical organization of the RDF database}
-
-The implementation of the {\MathQL} query engine does not depend on any
-software developed within the {\HELM} project, nor does it depend on the
-{\HELM} metadata model in any way.
-
-However the engine does make a few assumptions about the way metadata are
-physically organized and needs some user-provided knowledge about the concrete
-metadata representation.
-Metadata stored as {\RDF} triples are accessed through a {\MySQL}%
-\footnote{See \CURI{http://www.mysql.com}.}
-or a {\PostgreSQL}%
-\footnote{See \CURI{http://www.postgresql.org}.}
-engine, while metadata stored as {\RDF}/{\XML} files are accessed through a
-{\Galax}%
-\footnote{See \CURI{http://db.bell-labs.com/galax/}.}
-{\XQuery} \cite{XQuery} engine.