\title{MathQL-1 Version 4\\Reference Documentation}
-\author{Ferruccio Guidi}
+\author{Ferruccio Guidi%
+\thanks{This work has been partially supported by
+MoWGLI (European FET Project IST-2001-33562).}
+\institute{Department of Computer Science\\
+Mura Anteo Zamboni 7, 40127 Bologna, ITALY.\\
+\date{ }
D. Lordi:
\emph{Sperimentazione e Sviluppo di Strumenti per la gestione di metadati}. \\
Master Thesis in Computer Science, University of Bologna, 2002.
-Advisor: A. Asperti.
\bibitem {RDF}
\emph{Resource Description Framework (RDF) Model and Syntax Specification}.
+\section{The language}
This paper presents {\MathQL} version 4, which is the latest version of the
-language, fully developed by Ferruccio Guidi.
+language, fully developed by the Author.
For a description of the previous versions of {\MathQL} see: \cite{Gui03}
(version 3), \cite{GS03} (version 2), \cite{Lor02} (version 1).
The main novelties of this version are the elimination of some cast operators
-(producing a substantial simplification in the query structure and semantics,
-see \secref{Operational}), a clear distinction between the core language and
-the auxiliary functions of the basic library, a support for query generating
-functions, the possibility of extending the language adding new libraries of
-functions and a more uniform textual syntax.
+(producing a substantial simplification in the query structure and semantics),
+a clear distinction between the core language and the auxiliary functions of
+the basic library, support for query-generating functions, the possibility
+of extending the language by adding new libraries of functions, and a more
+uniform textual syntax.
{\MathQL}.4 incorporates the features of {\MathQL}.3 not documented on paper%
{See the ``what's new'' section of {\MathQL} Web Site:
-% \input{mathql_introduction_textual}
-\subsection {Sets of attributed values.}
+\subsection {Sets of attributed values.} \label{AVSets}
The data representation model used by {\MathQL} relies on the notion of
\emph{set of attributed values} ({\av} set for short) that is, in practice,
Each {\av} in an {\av} set consists of a string%
\footnote{When we say \emph{string}, we mean a finite sequence of characters.}
-(that we call the \emph{head string} or \emph{value}) and a (possibly emty)
+(that we call the \emph{head string} or \emph{value}) and a (possibly empty)
multiset of named attributes whose content is a set of strings.
Attribute names are made of a (possibly empty) list of string components, so
they can be hierarchically structured.
({\ie} subsets) to improve its structure.
In the above description a \emph{set} is an \emph{unordered} finite
-sequence \emph{without} repetitions wheras a \emph{multiset} is an
+sequence \emph{without} repetitions whereas a \emph{multiset} is an
\emph{unordered} finite sequence \emph{with} repetitions.
In the present context repetitions are defined as follows:
that were tried.}
As we said, {\MathQL}.4 uses {\av} sets to represent many kinds of
-information, namely:
Moreover structured attribute names can encode various components of
structured properties preserving their semantics.
\begin{footnotesize} \begin{verbatim}
The RDF triples:
-("http://www.w3.org/2002/01/rdf-databases/protocol", "dc:creator", "Sandro Hawke")
-("http://www.w3.org/2002/01/rdf-databases/protocol", "dc:creator", "Eric Prud'hommeaux")
-("http://www.w3.org/2002/01/rdf-databases/protocol", "dc:date", "2002-01-08")
+ ("protocol", "dc:creator", "Sandro Hawke")
+ ("protocol", "dc:creator", "Eric Prud'hommeaux")
+ ("protocol", "dc:date", "2002-01-08")
The corresponding attributed value:
-"http://www.w3.org/2002/01/rdf-databases/protocol" attr
- {"dc:creator" = {"Sandro Hawke", "Eric Prud'hommeaux"}; "dc:date" = "2002-01-08"}
+ "protocol" attr {/"dc:creator" = {"Sandro Hawke", "Eric Prud'hommeaux"};
+ /"dc:date" = "2002-01-08"}
\end{verbatim} \end{footnotesize}
\caption{The representation of a pool of {\RDF} triples} \label{AVOne}
\figref{AVOne} shows how a set of triples can be coded in an {\av}.
-Note that the word \emph{attr} separates the head string from its attributes,
+Note that the word \TT{attr} separates the head string from its attributes,
braces enclose an attribute group in which attributes are separated by
-semicolons, and an equal sign separates an attribute name from its contents
-(see \subsecref{Textual} for the complete {\av} syntax).
+semicolons, and an equal sign separates an attribute name from its contents.
In this setting the grouping feature can be used to separate semantically
different classes of properties associated with a resource (as for instance
A pool of arbitrarily chosen {\RDF} triples is encoded in an {\av} set
-placing different {\av}'s the subset of triples sharing the same subject.
+placing in each {\av} the subset of triples sharing the same head string.
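The grouping of triples by subject can be sketched in Python as follows.
This is only an illustration and not part of {\MathQL}: the function name
\TT{triples\_to\_av\_set} and the dictionary-based representation are
assumptions of this sketch, and all the properties of a subject are placed in
a single attribute group, as in \figref{AVOne}.

```python
from collections import defaultdict


def triples_to_av_set(triples):
    """Group (subject, property, value) triples by subject.

    Returns a dict mapping each head string (the subject) to one
    attribute group, itself a dict from attribute name to a set of
    strings. Using a single group per subject is an assumption of
    this sketch.
    """
    av_set = defaultdict(lambda: defaultdict(set))
    for subject, prop, value in triples:
        av_set[subject][prop].add(value)
    # Convert to plain dicts so that the result compares and prints cleanly.
    return {head: dict(group) for head, group in av_set.items()}


triples = [
    ("protocol", "dc:creator", "Sandro Hawke"),
    ("protocol", "dc:creator", "Eric Prud'hommeaux"),
    ("protocol", "dc:date", "2002-01-08"),
]
av_set = triples_to_av_set(triples)
```

Since the contents of an attribute form a set, the two ``dc:creator'' values
end up in the same attribute regardless of the order of the triples.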
Note that the use of {\av} sets to build query results allows {\MathQL} queries
to return sets of {\RDF} triples instead of mere sets of resources, in the
\figref{Table} shows an {\av} set describing the properties of two resources
``A'' and ``B'' giving its table representation, in which the columns
corresponding to attributes in the same group are clustered between
-double-line delimiters%
+double-line delimiters.%
\footnote{A table with grouped labelled columns like the one above resembles a
-set of relational database tables.}.
+set of relational database tables.}
-%Another possible use of a {\MathQL} query result is for the encoding of a
-%relational database table: in this sense the indexed column is stored in the
-%subject strings, the names of the other columns are stored in attribute names
-%and cell contents are stored in attribute values.
\begin{footnotesize} \begin{verbatim}
-"A" attr {"major" = "1"; "minor" = "2"}, {"first" = "2002-01-01"; "modified" = "2002-03-01"};
-"B" attr {"major" = "1"; "minor" = "7"}, {"first" = "2002-02-01"; "modified" = "2002-04-01"}
+"A" attr {/"major" = "1"; /"minor" = "2"},
+ {/"first" = "2002-01-01"; /"modified" = "2002-03-01"};
+"B" attr {/"major" = "1"; /"minor" = "7"},
+ {/"first" = "2002-02-01"; /"modified" = "2002-04-01"}
\begin{center} \begin{tabular}{|c||c|c||c|c||}
-\hline & {\bf ``major''} & {\bf ``minor''} & {\bf ``first''} & {\bf ``modified''} \\
+\hline & \textbf{``major''} & \textbf{``minor''} & \textbf{``first''} & \textbf{``modified''} \\
\hline ``A'' & ``1'' & ``2'' & ``2002-01-01'' & ``2002-03-01'' \\
\hline ``B'' & ``1'' & ``7'' & ``2002-02-01'' & ``2002-04-01'' \\
the set of the head strings (dimension 1), the attributes in each group
(dimension 2), the groups in each {\av} (dimension 3) and the contents of each
attribute (dimension 4).
The metadata defined in the table of \figref{Table} will be used in subsequent
-For this purpose assume that \TT{first} and \TT{modified} are the components
-of a structured property \TT{date} available for the resources ``A'' and ``B''.
+For this purpose assume that ``first'' and ``modified'' are the components
+of a structured property ``date'' available for the resources ``A'' and ``B''.
-The value of an {\RDF} property is encoded in a single {\av} distinguishing
-three situations:
+The value of an {\RDF} property is encoded in an {\av} distinguishing three
other components are stored in the {\av} attributes as in the case 1.
-If the property is structured and its value does not have a main component,
-the {\av} head string is empty and the components are stored in the
+For the value of a structured property without a main component, the head
+string is empty and the components are stored in the attributes.
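The choice of the main component can be sketched in Python as follows.
The function \TT{encode\_value} is hypothetical and only models a single
instance of a structured property: it returns the head string and one
attribute group, with plain strings standing in for attribute names.

```python
def encode_value(components, main=None):
    """Encode one structured property value as (head string, attribute group).

    components: dict from component name to its string value.
    main: name of the main component, or None if there is none.
    """
    if main is None:
        head = ""                  # no main component: empty head string
        rest = components
    else:
        head = components[main]    # the main component becomes the head
        rest = {k: v for k, v in components.items() if k != main}
    # The contents of an attribute are a set of strings.
    group = {name: {value} for name, value in rest.items()}
    return head, group


value = {"major": "1", "minor": "2"}
no_main = encode_value(value)
main_major = encode_value(value, main="major")
main_minor = encode_value(value, main="minor")
```

The three results correspond to the three representations of the first
example in \figref{AVTwo}.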
\begin{footnotesize} \begin{verbatim}
First example, one instance:
-"" attr {"major" = "1"; "minor" = "2"}; no main component
-"1" attr {"minor" = "2"}; main component is "major"
-"2" attr {"major" = "1"} main component is "minor"
+ "" attr {/"major" = "1"; /"minor" = "2"} no main component
+ "1" attr {/"minor" = "2"} main component is "major"
+ "2" attr {/"major" = "1"} main component is "minor"
Second example: two separate instances:
-"" attr {"major" = "1"; "minor" = "2"}, {"major" = "1"; "minor" = "7"}; no main component
-"1" attr {"minor" = "2"}, {"minor" = "7"} main component is "major"
+ "" attr {/"major" = "1"; /"minor" = "2"},
+ {/"major" = "1"; /"minor" = "7"} no main component
+ "1" attr {/"minor" = "2"}, {/"minor" = "7"} main component is "major"
Third example: two mixed instances:
-"" attr {"major" = "3", "6"; "minor" = "4", "9"} no main component
+ "" attr {/"major" = {"3", "6"}; /"minor" = {"4", "9"}} no main component
\end{verbatim} \end{footnotesize}
\caption{The representation of the structured value of a property}
\figref{AVTwo} (first example) shows three possible ways of representing in
-{\av}'s an instance of a structured property \TT{id} whose value has two
-fields ({\ie} properties) \TT{major} and \TT{minor}.
-In this instance, \TT{major} is set to ``1'' and \TT{minor} is set to ``2''.
-The representations depend on which component of \TT{id} is chosen as the
-main component (none, \TT{major} or \TT{minor} respectively).
+{\av}'s an instance of a structured property ``id'' whose value has two
+fields ({\ie} properties) ``major'' and ``minor''.
+In this instance, ``major'' is set to ``1'' and ``minor'' is set to ``2''.
+The representations depend on which component of ``id'' is chosen as the
+main component (none, ``major'' or ``minor'' respectively).
Several structured property values sharing a common main component can be
encoded in a single {\av} exploiting the grouping facility: in this case the
attributes of every instance are enclosed in separate groups.
\figref{AVTwo} (second example) shows the representations of two instances of
-\TT{id}: the previous one and a new one for which \TT{major} is ``1'' and
-\TT{minor} is ``7''.
+``id'': the former and a new one for which ``major'' is ``1'' and ``minor'' is
Note that if the attributes of the two groups are encoded in a single group,
the notion of which components belong to the same property value cannot be
recovered in the general case because the values of an attribute form a set
-and thus are unordered. \newline
-As an example think of two instances of \TT{id} encoded as in \figref{AVTwo}
+and thus are unordered.
+As an example think of two instances of ``id'' encoded as in \figref{AVTwo}
(third example).
-{\MathQL} defines five binary operations on {\av} sets: two unions, two
+{\MathQL} defines five core binary operations on {\av} sets: two unions, two
intersections and a difference. The first four are defined in terms of an
operation, that we call \emph{addition}, involving two {\av}'s with the same
head string.
-With the \emph{set-theoretic} addition, the set of attribute groups in the
+with the \emph{set-theoretic} addition, the set of attribute groups in the
resulting {\av} is the set-theoretic union of the sets of attribute groups in
-the operands.
+the operands;
-With the \emph{distributive} addition, the set of attribute groups in the
+with the \emph{distributive} addition, the set of attribute groups in the
resulting {\av} is the ``Cartesian product'' of the sets of attribute groups
in the two operands.
-In this context, an element of the ``Cartesian product'' is not a pair of
-groups but it is the set-theoretic union of these groups where the contents of
-homonymous attributes are clustered together using set-theoretic unions.
+Here an element of the ``Cartesian product'' is not a pair of groups but it is
+the set-theoretic union of these groups where the contents of homonymous
+attributes are clustered together using set-theoretic unions.
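The two kinds of addition can be modelled by the following Python sketch.
It is not the {\MathQL} implementation: the helper names are hypothetical,
an attribute group is represented as a frozen set of (name, contents) pairs,
and plain strings stand in for attribute names.

```python
from itertools import product


def group(**attrs):
    """Build a hashable attribute group: name -> set of strings."""
    return frozenset((name, frozenset(vals)) for name, vals in attrs.items())


def merge_groups(g1, g2):
    """Union of two groups; homonymous attributes have their contents united."""
    merged = {}
    for name, vals in list(g1) + list(g2):
        merged[name] = merged.get(name, frozenset()) | vals
    return frozenset(merged.items())


def add_set_theoretic(groups1, groups2):
    """Set-theoretic addition: union of the two sets of groups."""
    return frozenset(groups1) | frozenset(groups2)


def add_distributive(groups1, groups2):
    """Distributive addition: merge every pair of the Cartesian product."""
    return frozenset(merge_groups(g1, g2) for g1, g2 in product(groups1, groups2))


# Groups of two av's sharing the head string "1".
left = {group(A={"a"}), group(B={"b1"})}
right = {group(A={"a"}), group(B={"b2"})}

set_sum = add_set_theoretic(left, right)
dist_sum = add_distributive(left, right)
```

With these operands the set-theoretic addition yields three groups, because
the repeated group is kept only once, whereas the distributive addition yields
one group for each of the four pairs.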
\figref{Addition} shows an example of the two kinds of addition.
\begin{footnotesize} \begin{verbatim}
Attributed values used as operands for the addition:
-"1" attr {"A" = "a"}, {"B" = "b1"}
-"1" attr {"A" = "a"}, {"B" = "b2"}
+ "1" attr {/"A" = "a"}, {/"B" = "b1"}
+ "1" attr {/"A" = "a"}, {/"B" = "b2"}
Set-theoretic addition:
-"1" attr {"A" = "a"}, {"B" = "b1"}, {"B" = "b2"}
+ "1" attr {/"A" = "a"}, {/"B" = "b1"}, {/"B" = "b2"}
Distributive addition:
-"1" attr {"A" = "a"}, {"A" = "a"; "B" = "b2"}, {"B" = "b1"; "A" = "a"}, {"B" = {"b1", "b2"}}
+ "1" attr {/"A" = "a"}, {/"A" = "a"; /"B" = "b2"},
+ {/"B" = "b1"; /"A" = "a"}, {/"B" = {"b1", "b2"}}
\end{verbatim} \end{footnotesize}
\caption{The addition of attributed values}
-Now we can discuss the five operations between {\av} sets that we mentioned
+Now we can discuss the five operations between {\av} sets:
-The two unions ocorresponds to the set-theoretic union of their operand where
-the {\av}'s sharing the head string are are added either set-theoretically or
+The two unions correspond to the set-theoretic union of their operands, where
+the {\av}'s sharing the head string are added either set-theoretically or
distributively as explained above (thus we have a set-theoretic union and a
distributive union in the two cases). In this context the empty {\av} set
plays the role of the neutral element.
The two intersections are the dual of the above unions: they contain the
-{\av}'s whose head string appears in each argument where {\av}'s sharing the
-head string are added either set-theoretically or distributively as before.
+{\av}'s whose head string appears in each argument where the {\av}'s sharing
+the head string are added either set-theoretically or distributively as before.
The distributive intersection has the double benefit of filtering the
common values of the given {\av} sets, and of merging their attribute groups
\figref{Binary} shows how the above operations work in a simple example.
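The five operations can also be modelled by a short Python sketch.
All the names below are hypothetical: an {\av} set is represented as a
dictionary from head strings to frozen sets of attribute groups, and the
difference is assumed to keep the {\av}'s of the first operand whose head
string does not occur in the second one, which agrees with the example but is
an assumption of this sketch.

```python
from itertools import product


def group(**attrs):
    """Build a hashable attribute group: name -> set of strings."""
    return frozenset((n, frozenset(v)) for n, v in attrs.items())


def merge(g1, g2):
    """Union of two groups; homonymous attributes have their contents united."""
    merged = {}
    for name, vals in list(g1) + list(g2):
        merged[name] = merged.get(name, frozenset()) | vals
    return frozenset(merged.items())


def add_set(gs1, gs2):
    """Set-theoretic addition of two sets of groups."""
    return gs1 | gs2


def add_dist(gs1, gs2):
    """Distributive addition of two sets of groups."""
    return frozenset(merge(a, b) for a, b in product(gs1, gs2))


def union(s1, s2, add):
    """All head strings; av's sharing a head string are added with `add`."""
    result = dict(s1)
    for head, groups in s2.items():
        result[head] = add(s1[head], groups) if head in s1 else groups
    return result


def intersection(s1, s2, add):
    """Only the head strings common to both operands, added with `add`."""
    return {h: add(s1[h], s2[h]) for h in s1 if h in s2}


def difference(s1, s2):
    """Assumed semantics: av's of s1 whose head string is absent from s2."""
    return {h: gs for h, gs in s1.items() if h not in s2}


s1 = {"1": frozenset({group(A={"a"})}), "2": frozenset({group(B={"b1"})})}
s2 = {"2": frozenset({group(B={"b2"})})}
```

Passing the addition as a parameter mirrors the fact that the two unions and
the two intersections differ only in the way the {\av}'s sharing a head string
are added.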
\begin{footnotesize} \begin{verbatim}
Sets of attributed values used as operands for the operations:
-"1" attr {"A" = "a"}; "2" attr {"B" = "b1"}
-"2" attr {"B" = "b2"}
+ "1" attr {/"A" = "a"}; "2" attr {/"B" = "b1"}
+ "2" attr {/"B" = "b2"}
Set-theoretic union:
-"1" attr {"A" = "a"}; "2" attr {"B" = "b1"}, {"B" = "b2"}
+ "1" attr {/"A" = "a"}; "2" attr {/"B" = "b1"}, {/"B" = "b2"}
Distributive union:
-"1" attr {"A" = "a"}; "2" attr {"B" = {"b1", "b2"}}
+ "1" attr {/"A" = "a"}; "2" attr {/"B" = {"b1", "b2"}}
Set-theoretic intersection:
-"2" attr {"B" = "b1"}, {"B" = "b2"}
+ "2" attr {/"B" = "b1"}, {/"B" = "b2"}
Distributive intersection:
-"2" attr {"B" = {"b1", "b2"}}
+ "2" attr {/"B" = {"b1", "b2"}}
-"1" attr {"A" = "a"}
+ "1" attr {/"A" = "a"}
\end{verbatim} \end{footnotesize}
\caption{The binary operations on sets of attributed values}
--- /dev/null
+\subsection{The basic library} \label{Basic}
+The present paper leaves us too little space to present a complete
+description of the {\MathQL}.4 basic library, so we only give an overview of
+the features it provides.
+For the user's convenience, {\MathQL}.4 includes a syntax extension for all the
+basic library functions, in order to hide the actual function invocation.
+Here are some of the provided constructions:
+\textbf{Aliases for commonly used constant {\av} sets.}
+\EM{empty}, \EM{false}, \EM{true}.
+\textbf{Conditional operator.}
+\TT{if} \EM{av-set} \TT{then} \EM{av-set} \TT{else} \EM{av-set}.
+Tests the first {\av} set for inhabitance and evaluates one of the other {\av}
+sets accordingly.
+\textbf{Standard \emph{select} clause.}
+\TT{select @}\EM{variable} \TT{from} \EM{av-set-1} \TT{where} \EM{av-set-2}.
+It is:
+\TT{for @}\EM{variable} \TT{in} \EM{av-set-1} \TT{sup}
+\TT{if} \EM{av-set-2} \TT{then @}\EM{variable} \TT{else empty}.
+\textbf{Set refinement}. The operator
+\TT{keep} \EM{optional-flag} \EM{name-list} \TT{in} \EM{av-set}
+removes from its argument every attribute whose name is included (or is not,
+according to the \EM{flag}) in the given \EM{name-list}.
+If the \EM{flag} is not present, the \EM{name-list} specifies the attributes
+to keep, whereas if the \EM{flag} is \TT{allbut}, the \EM{name-list} specifies
+the attributes to remove.
+Removing unwanted information from an {\av} set is useful in two cases: it
+lowers the complexity of intermediate query results increasing the performance
+of subsequent operations and it cleans the final query results making them
+easier to manage for the application that submits the query.%
+{Interpreting {\av} sets as relational database tables, this functionality
+allows one to select the columns a table is made of, as with the {\SQL}
+\emph{select} operator.}
+The operator
+\TT{proj} \EM{name} \TT{of} \EM{av-set} makes the set-theoretic union of the
+contents that the specified attribute has in each group of the given {\av} set.
+Each element of this union then becomes the head string of an {\av} without
+attributes and the set of these is returned.%
+{This is the content of a labelled column of the given \EM{av-set} viewed as a
+See \figref{Proj}.
+\begin{footnotesize} \begin{verbatim}
+proj /"name" of ["1" attr {/"name" = {"a", "b"}}, {/"name" = {"b", "c"}}]
+gives "a"; "b"; "c"
+\end{verbatim} \end{footnotesize}
+\caption{A simple projection} \label{Proj}
+The construction \TT{keep} \EM{av-set} is also provided to remove all the
+attributes in the given \EM{av-set} ({\ie} the list of the attributes to keep
+is empty).%
+{This is the content of the first column of the \EM{av-set} viewed as a table.}
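The behaviour of \TT{proj} and \TT{keep} can be sketched in Python as follows.
The functions are hypothetical models and not the library code: an {\av} set
is a dictionary from head strings to lists of groups, plain strings stand in
for attribute names, and dropping the groups that become empty is an
assumption of this sketch.

```python
def proj(name, av_set):
    """Unite the contents of `name` over all groups of all av's.

    Each string found becomes the head of an attribute-free av.
    """
    values = set()
    for groups in av_set.values():
        for g in groups:
            values |= g.get(name, set())
    return {value: [] for value in values}


def keep(names, av_set, allbut=False):
    """Keep the attributes listed in `names`, or all but those if `allbut`."""
    result = {}
    for head, groups in av_set.items():
        kept = []
        for g in groups:
            filtered = {n: c for n, c in g.items() if (n in names) != allbut}
            if filtered:  # assumption: groups left without attributes are dropped
                kept.append(filtered)
        result[head] = kept
    return result


av_set = {"1": [{"name": {"a", "b"}}, {"name": {"b", "c"}}]}
projected = proj("name", av_set)
stripped = keep([], av_set)
```

Here \TT{projected} contains the three head strings ``a'', ``b'' and ``c'',
each without attributes, and \TT{stripped} contains only the head string ``1''.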
+\textbf{Core operations on {\av} sets.}
+\EM{av-set} \EM{core-operator} \EM{av-set}
+returns an {\av} set composing the two operands according to the specified
+\EM{core-operator} (see \subsecref{AVSets}) which can be \TT{union}
+(set-theoretic union), \TT{intersect} (set-theoretic intersection) or
+\TT{diff} (difference).
+\TT{union} and \TT{intersect} are also provided in their $n$-ary form
+($ n \ge 1 $ for \TT{intersect}) and the $n$-ary union has the syntax
+\verb+{+ \EM{av-set} \TT{,} $\cdots$ \TT{,} \EM{av-set} \verb+}+.
+\textbf{Logical operations on {\av} sets.}
+\EM{and}, \EM{or}, \EM{xor}, \EM{not}.
+They are inspired by the C-style Boolean operators defined for the
+integer numbers. In particular:
+\TT{not} \EM{av-set}:
+returns \emph{false} if the \EM{av-set} is inhabited, or \emph{true} otherwise.
+\EM{av-set-1} \TT{and} \EM{av-set-2}:
+gives \EM{av-set-2} if \EM{av-set-1} is inhabited, or \emph{false} otherwise.
+\EM{av-set-1} \TT{or} \EM{av-set-2}:
+returns \EM{av-set-1} if it is inhabited, or \EM{av-set-2} otherwise.
+\EM{av-set-1} \TT{xor} \EM{av-set-2}:
+gives \emph{false} if both {\av} sets are inhabited or both are empty, or the inhabited
+\EM{av-set} otherwise.
+\TT{and} and \TT{or} are also available in their $n$-ary form.
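The four logical operations can be summarised by the following Python sketch,
in which an {\av} set is a dictionary indexed by head strings.
The value chosen for \TT{TRUE} is only a placeholder, since the default
representation of \emph{true} is not given here, and the function names are
hypothetical.

```python
FALSE = {}            # the empty av set
TRUE = {"true": []}   # placeholder for the default representation of true


def inhabited(av_set):
    """An av set is inhabited when it contains at least one av."""
    return len(av_set) > 0


def not_(a):
    return FALSE if inhabited(a) else TRUE


def and_(a, b):
    return b if inhabited(a) else FALSE


def or_(a, b):
    return a if inhabited(a) else b


def xor_(a, b):
    if inhabited(a) == inhabited(b):
        return FALSE  # both inhabited or both empty
    return a if inhabited(a) else b


x = {"A": []}
y = {"B": []}
```

As with the C operators, the result of \TT{and} and \TT{or} is one of the
operands rather than a normalised truth value.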
+\textbf{Comparisons between {\av} sets.}
+\EM{av-set} \EM{test-operator} \EM{av-set}.
+Following the repetition rules of {\av} sets presented in \subsecref{AVSets},
+these operators work just on the head strings of their arguments and
+discard the attributes. All of them return \emph{false} or \emph{true}
+according to the outcome of the respective test.
+The \emph{test-operator} includes: \TT{sub} (set-theoretic subset relation),
+\TT{eq} (set-theoretic equality), \TT{meet} (inhabitance of the set-theoretic
+intersection), \TT{le} (numeric less-or-equal-than), \TT{lt} (numeric
+\footnote{\TT{le} and \TT{lt} return \emph{false} if their operands are invalid
+Note that the set-theoretic ``meet'' operator
+({\ie} $ V \meet W \equiv (\lex v \in V)\ v \in W $)
+is the natural companion of the corresponding ``sub'' operator
+({\ie} $ V \sub W \equiv (\lall v \in V)\ v \in W $) being its logical dual
+and is already being used successfully in the context of a constructive
+({\ie} intuitionistic and predicative) approach to point-free topology
+(see \cite{Sam00} for details).
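The duality between ``sub'' and ``meet'' can be made concrete with a small
Python sketch working on head strings only.
The functions are hypothetical and return Python Booleans instead of the
{\av} sets \emph{false} and \emph{true} returned by the actual operators.

```python
def heads(av_set):
    """The head strings of an av set given as a dict; attributes are ignored."""
    return set(av_set)


def sub(v, w):
    """Universal test: every head string of v is a head string of w."""
    return all(x in heads(w) for x in heads(v))


def meet(v, w):
    """Existential test: some head string of v is a head string of w."""
    return any(x in heads(w) for x in heads(v))


def eq(v, w):
    """Set-theoretic equality of the head strings."""
    return heads(v) == heads(w)


v = {"1": [], "2": []}
w = {"2": [{"B": {"b"}}], "3": []}
empty = {}
```

On the empty {\av} set the two tests give opposite answers: \TT{sub(empty, w)}
holds vacuously, whereas \TT{meet(empty, w)} fails because no witness exists.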
+\textbf{Cardinality of an {\av} set.}
+This information is retrieved by the operator \TT{count} \EM{av-set} that
+returns an {\av} set representing a natural number.
{\MathQL}.4 consists of a core language and of a basic library. Other
user-defined libraries can be added at will. The core language includes the
\TT{property} operator mentioned in \subsecref{HighAccess} that queries the
-underlying {\RDF} database and the infrastructure to post-process the
+underlying {\RDF} database, and the infrastructure to post-process the
query results. The components of this infrastructure are listed below:
-Explicit sets of attributed values.
+\textbf{Explicit sets of attributed values.}
An explicit {\av} set can be placed in a query in two forms:
-as a single quoted string, like \verb+"this is a query result"+, that
-evaluates in a single {\av} with that value and no attributes, or as a full
-{\av} set in the syntax shown in the previous sections but sorrounded by
-square brackets, like \verb+["head" attr {"attribute-name" = "contents"}]+.
-In the second form, the contents of an attribute can be the result of a query,
-\verb+["head" attr {"attribute-name" = property /"metadata" of "resource"}]+.
+as a quoted string, like \verb+"this is a query result"+, that evaluates to a
+single {\av} with that value and no attributes, or as a full {\av} set in the
+syntax shown in the previous sections but surrounded by square brackets, like
+\verb+["head" attr {/"attribute" = "contents"}]+.
-In this case the contents of the attribute are the head strings of the query
-result, whose attributes (if any) are discarded.
+In the second form, the contents of an attribute can be the result of a query
+and in this case the contents of the attribute are the head strings of the
+query result, whose attributes (if any) are discarded.
-Variable assignment.
+\verb+["head" attr {/"attribute" = property /"metadata" of "resource"}]+
+\textbf{Variable assignment.}
Variables for {\av} sets (preceded by a \TT{\$} sign and called
\emph{set variables}) can be assigned using a standard \emph{let-in}
construction and may appear wherever an {\av} set ({\ie} a query result) is
The assignment has the form:
\TT{let \$}\EM{variable} \TT{=} \EM{av-set} \TT{in} \EM{av-set}
so we can write:
-\verb+let $var = "contents" in ["head" attr {"attribute-name" = $var}]+.
+\verb+let $var = "contents" in ["head" attr {/"attribute" = $var}]+.
-The scope rules of {\MathQL} variables are tipical for an imperative
-programming language and any case of assignment propagation will be indicated.
+The scope of {\MathQL} variables is typical for an imperative programming
+language and any case of assignment propagation will be indicated.
-Sequential composition.
+\textbf{Sequential composition.}
This construction has the form: \EM{av-set} \TT{;;} \EM{av-set} and works as
follows: the two {\av} sets are evaluated one after the other and the first
one is discarded but the variables assigned in the first {\av} set are
available to the second one.
-Unbounded iteration.
+\textbf{Unbounded iteration.}
This construction comes in two forms:
-\TT{while} \EM{av-set} \TT{sup} \EM{av-set}:
-iterates the evaluation of the second {\av} set until the first {\av set} is
-empty and returns the {\MathQL} set-theoretic union of all the evaluiations
-of the second {\av set}.
-\TT{while} \EM{av-set} \TT{inf} \EM{av-set}:
-like the former but set-theoretic intersection is used instead of
+\TT{while} \EM{av-set-1} \TT{sup} \EM{av-set-2}:
+iterates the evaluation of \EM{av-set-2} until \EM{av-set-1} is empty and
+returns the {\MathQL} set-theoretic union of all the evaluations of
+\TT{while} \EM{av-set-1} \TT{inf} \EM{av-set-2}:
+like the former but the set-theoretic intersection is used instead of the
set-theoretic union.
-In order for \TT{while} to work as expected, both {\av} sets are evaluaed in
+In order for \TT{while} to work as expected, both {\av} sets are evaluated in
a common context during the iteration ({\ie} the variables defined in both
-are alailable to both) and this context is also propagated outside the
-\TT{while} for convenience.
+are available to both) and this context is also propagated outside the
-Bounded iteration.
+\textbf{Bounded iteration.}
This construction also comes in two forms:
\TT{for @}\EM{variable} \TT{in} \EM{av-set} \TT{sup} \EM{av-set}:
iterates the evaluation of the second \EM{av-set} assigning the \EM{variable}
to each element in the first \EM{av-set} and builds the {\MathQL}
set-theoretic union of the obtained results.
\TT{for @}\EM{variable} \TT{in} \EM{av-set} \TT{inf} \EM{av-set}:
-like the former but set-theoretic intersection is used instead of
+like the former but the set-theoretic intersection is used instead of the
set-theoretic union.
The variables for attributed values (preceded by a \TT{@} sign and called
\emph{element variables}) may appear wherever an {\av} set is allowed and
in some additional places.
The element variables are kept distinct from the set variables (therefore
\TT{\$variable} and \TT{@variable} may appear in the same query without
Concerning the scope rules used in these constructions, the variables
assigned by the first {\av} set are available to the second {\av} set during
the iteration and the variables assigned by both {\av} sets are available
outside the \TT{for} as in the previous case.
-Addition of groups.
+\textbf{Addition of groups.}
\TT{add} \EM{optional-flag} \EM{attribute-groups} \TT{in} \EM{av-set}
builds an {\av} set adding the specified \EM{attribute-groups} to each element
of the given {\av} set.
If no \EM{flag} is specified the addition is set-theoretic, whereas with the
\figref{Add} shows how to build a one-element {\av} set using \TT{add}.
\begin{footnotesize} \begin{verbatim}
The set of attributed values given explicitly:
-["head" attr {"attribute-name" = property /"metadata" of "resource"}]
+ ["head" attr {/"attribute" = property /"metadata" of "resource"}]
-The same set built with the add operator
-add {"attribute-name" = property /"metadata" of "resource"} in "head"
+The same set built with the add operator:
+ add {/"attribute" = property /"metadata" of "resource"} in "head"
\end{verbatim} \end{footnotesize}
\caption{A simple use of the add operator}
-Existential test.
+\textbf{Existential test.}
The existential test has the form \TT{ex} \EM{av-set} where the
specification of the {\av} set contains some instances of the construction
\TT{@}\EM{variable}\TT{.}\EM{attribute-name}, and runs as follows:
with the contents of \EM{attribute-name} in an attribute group of the {\av}
stored in \TT{@}\EM{variable} and the evaluation is repeated for every
-possible choice of the groups (recall that different groups are allowed to
+possible choice of these groups (recall that different groups are allowed to
contain attributes with the same name). If one evaluation gives a non empty
-result, the defaut representation of \emph{true} is returned (the test
-succeded) in the other case the empty {\av} set, {\ie} \emph{false}, is
-returned (the test failed).
+result, the default representation of \emph{true} is returned, in the other
+case the empty {\av} set, {\ie} \emph{false}, is returned.
-Function invocation:
+\textbf{Function invocation.}
The core language allows the invocation of two kinds of external functions (with
which a language extension may be provided): the functions of the first kind
return an {\av} set, the functions of the second kind return a piece of
\EM{function-name} \verb+{+
\EM{name} \TT{,} $\cdots$ \TT{,} \EM{name} \verb+} {+
\EM{av-set} \TT{,} $\cdots$ \TT{,} \EM{av-set} \verb+}+
-invokes the specified function of the firs kind on the given arguments and
-returns it result. The \EM{name} argument are {\MathQL} paths and usually
+invokes a function of the first kind on the given arguments and returns its
+result. The \EM{name} arguments are {\MathQL} paths and usually
represent attribute names.
\TT{gen} \EM{function-name} \verb+{+
\EM{av-set} \TT{,} $\cdots$ \TT{,} \EM{av-set} \verb+}+
-invokes the specified function of the second kind on the given arguments and
-replaces the function invocation with its result.
+invokes a function of the second kind on the given arguments and replaces
+itself with the function result.
-The function names are {\MathQL} paths exactly as the attribute nanes and the
+The function names are {\MathQL} paths exactly like the attribute names and the
graph paths used by the \TT{property} operator. The names of the two kinds of
functions are kept in distinct environments so they do not clash.
{\MathQL}.4 comes with a basic library of functions of the first kind
-(see \subsecref{IBasic}) that integrate the core language providing several
-facilities to the user.
+(see \subsecref{Basic}) that integrate the core language providing several
\subsection{High level access to metadata} \label{HighAccess}
{\MathQL} high level access to an {\RDF} database is \emph{graph-oriented} and
-is delegated to its \TT{property} operator, that formally accesses an {\RDF}
-{When we say {\RDF} graph, we actually mean both the {\RDFM} graph and the
-{\RDFS} graph.}
-through an \emph{access relation} which is better understood by explaining
-the informal semantics of the operator itself.
-This operator builds a \emph{result} {\av} set starting from two mandatory
-arguments: the \emph{source} {\av} set and the \emph{head path}.
+is delegated to its \TT{property} operator that builds a \emph{result} {\av}
+set starting from two mandatory arguments: the \emph{source} {\av} set and the
+\emph{head path}.
Other optional arguments may be used to change its default behaviour or to
request advanced functionalities.
-Its textual syntax is (see \subsecref{Textual}):
+This operator has the following syntax, where a path has the structure of an
+attribute name ({\ie} a list of strings) and denotes a (possibly empty) finite
+sequence of contiguous arcs (describing properties in the {\RDF} graph%
+{When we say \emph{{\RDF} graph}, we actually mean both the {\RDFM} graph and
+the {\RDFS} graph.}%
\TT{property} \EM{optional-flags} \EM{head-path} \EM{optional-clauses} \TT{of}
\EM{optional-flag} \EM{av-set}
-A path has the structure of an attribute name ({\ie} a list of strings) and
-denotes a (possibly empty) finite sequence of contiguous arcs (describing
-properties in the {\RDF} graph).
\begin{footnotesize} \begin{verbatim}
-These examples refer to the resources "A" and "B" of Figure 2.
+ These examples refer to the resources "A" and "B" of Figure 2.
Example 1: reading an unstructured property - simple case:
-property "id"/"major" of {"A", "B"} returns "1"
-property "id"/"minor" of {"A", "B"} returns "2"; "7"
+ property /"id"/"major" of {"A", "B"} gives "1"
+ property /"id"/"minor" of {"A", "B"} gives "2"; "7"
Example 2: reading an unstructured property - use of pattern:
-property "id"/"minor" of pattern ".*" returns "2"; "7"
+ property /"id"/"minor" of pattern ".*" gives "2"; "7"
Example 3: reading a structured property without main component:
-property "id" attr "major", "minor" of {"A", "B"}
-generates the following attributed values:
-"" attr {"major" = "1"; "minor" = "2"}; "" attr {"major" = "1"; "minor" = "7"}
-that are composed using MathQL-1 set-theoretic union giving the one-element set:
-"" attr {"major" = "1"; "minor" = "2"}, {"major" = "1"; "minor" = "7"}
+ property /"id" attr /"major", /"minor" of {"A", "B"}
+ generates the following attributed values:
+ "" attr {/"major" = "1"; /"minor" = "2"};
+ "" attr {/"major" = "1"; /"minor" = "7"}
+ that are composed with the set-theoretic union giving:
+ "" attr {/"major" = "1"; /"minor" = "2"},
+ {/"major" = "1"; /"minor" = "7"}
Example 4: reading a structured property specifying a main component:
-property "id" main "major" attr "minor" of {"A", "B"} gives
-"1" attr {"minor" = "2"}, {"minor" = "7"}
+ property /"id" main /"major" attr /"minor" of {"A", "B"} gives
+ "1" attr {/"minor" = "2"}, {/"minor" = "7"}
Example 5: the renaming mechanism:
-property "id" attr "minor" as "new-name" of {"A", "B"} gives
-"" attr {"new-name" = "2"}, {"new-name" = "7"}
+ property /"id" attr /"minor" as /"new-name" of {"A", "B"} gives
+ "" attr {/"new-name" = "2"}, {/"new-name" = "7"}
Example 6: imposing constraints on property values:
-property "date" istrue "first" in "2002-01-01" attr "modified" of {"A", "B"} and
-property "date" istrue "first" match ".*01.*" attr "modified" of {"A", "B"} give
-"" attr {"modified" = "2002-03-01"}
-Only the instance of "date" with "first" set to "2002-01-01" is considered.
+ property /"date" istrue /"first" in "2002-01-01"
+ attr /"modified" of {"A", "B"} and
+ property /"date" istrue /"first" match ".*01.*"
+ attr /"modified" of {"A", "B"} give
+ "" attr {/"modified" = "2002-03-01"}
+ Only the instance of "date" with "first" set to "2002-01-01"
+ is considered.
Example 7: inverse traversal of the head path:
-property inverse "date" attr "first" in subj "" gives
-"A" attr {"first" = "2002-01-01"}; "B" attr {"first" = "2002-02-01"}
-Example 8: some triples of an access relation:
-The triples formalizing the property "date" of the resource "A":
-("A", "date", "");
-("A", "date"/"first", "2002-01-01"); ("A", "date"/"modified", "2002-03-01")
+property inverse /"date" attr /"first" in "" gives
+"A" attr {/"first" = "2002-01-01"}; "B" attr {/"first" = "2002-02-01"}
\end{verbatim} \end{footnotesize}
\caption{The ``property'' operator}
source {\av} set (call them source resources) as start-nodes.
-The computation gives a set of nodes in the {\RDF} graph ({\ie} the end-nodes
-of the instantiated paths) which are the values of the instances of the
-(possibly compound) property specified by the path and concerning the source
+The computation gives a set of nodes ({\ie} the end-nodes of the instantiated
+paths) which are the values of the instances of the (possibly compound)
+property specified by the path and concerning the source resources.
These values, encoded into {\av}'s as explained above, are composed by means
specification overrides the default setting inferred from the {\RDF} graph
through the \emph{rdf:value} property) and the list of the value's secondary
components in the \TT{attr} \EM{optional-clause}.
Note that if a secondary component is not listed in the \TT{attr} clause, it
will not be read.
Also recall that, when the result {\av}'s are formed, the main component is
Note that the name of an attribute, which by default is its defining path in
the \TT{attr} clause, can be changed with an optional \TT{as} clause for the
user's convenience. See for instance \figref{Property} (example 5).
-The alternative could be a simple string but it needs to be a path for typing
-reasons. In any case a string can be seen as a one-element path.
+Note that the assigned name must be a path for typing reasons.
+The alternative could be to use a simple string but in any case a string can
+be seen as a one-element path.
In the default case \TT{property} builds its result considering every
component of the {\RDFM} graph ({\ie} every {\RDFM}) but we can constrain
a path in the {\RDF} graph starting from the end-node of the head path.
\TT{property} gives access to the {\RDFS} property hierarchy by specifying
-a flag named \TT{sub} or \TT{super}.
+a flag \TT{sub} or \TT{super}.
If the \TT{sub} flag is present, \TT{property} inspects the instances of the
default tree (made by the head path and by the \EM{optional-clauses} paths)
and every other tree obtained by substituting an arc $ p $ with the arc of a
It encodes the resources corresponding to the instances of the start-nodes into
{\av}'s assigning the attributes obtained instantiating the attribute paths%
-\footnote{The path in \EM{optional-clauses} are never traversed backward.}
-and composes these {\av}'s using the {\MathQL} set-theoretic union to build
-the result set.
+\footnote{The paths in the \EM{optional-clauses} are never traversed backward.}
+and builds the result set composing these {\av}'s with the set-theoretic union.
See for instance \figref{Property} (example 7).
-Now we can present \emph{access relations} which are the formal tools used by
-{\MathQL} semantics to access the {\RDF} graph.
-An access relation is a set of triples $ (r_1, p, r_2) $ where $ r_1 $ and
-$ r_2 $ are strings, $ p $ is a path (encoded as a list of strings).
-Each triple is a sort of ``extended {\RDF} triple'' in the sense that $ r_1 $
-is is a resource for which metadata is provided, $ p $ is a path in the {\RDF}
-graph and $ r_2 $ is the main value of the end-node of the instance of $ p $
-starting from $ r_1 $ (this includes the instances of sub- and super-arcs of
-$ p $ if necessary).
-See for instance \figref{Property} (example 8).
-{\MathQL} does not provide for any built-in access relation so any query
-engine can freely define the access relations that are appropriate with
-respect to the metadata it can access.
-In particular, \secref{Interpreter} describes the access relations implemented
-by the {\HELM} query engine.
-It is worth remarking, as it was already stressed in \cite{GS03, Gui03}, that
-the concept of access relation corresponds to the abstract concept of
-property in the basic {\RDF} data model which draws on well established
-principles from various data representation communities.
-In this sense an {\RDF} property can be thought of either as an attribute of a
-resource (traditional attribute-value pairs model), or as a relation between
-a resource and a value (entity-relationship model).
-This observation leads us to conclude that {\MathQL} is sound and complete
-with respect to querying an abstract {\RDF} metadata model.
-Finally note that access relations are close to {\RDF} entity-relationship
-model, but they do not work if we allow paths with an arbitrary number of
-loops ({\ie} with an arbitrary length) because this would lead to creating
-infinite sets of triples.
-If we want to handle this case, we need to turn these relations into
-multivalued functions.
-\subsection{Textual syntax} \label{Textual}
+\section{Textual syntax} \label{Textual}
-The syntax of grammatical productions resembles BNF and POSIX notation:
+In this section we present {\MathQL}.4 textual syntax using the same notation
+that we adopted in \cite{GS03,Gui03}. In particular the grammatical
+productions we use resemble {\BNF} with some {\POSIX} formalism:
\TT{::=} defines a grammatical production by means of a regular expression.
Regular expressions are made of the following elements
-(here \TT{...} is a placeholder):
-% \item
-% \TT{.} represents any character between U 0020 and U 007F inclusive;
+(\TT{...} is a placeholder):
\TT{`...`} represents any character in a character set;
-\verb+`^ ...`+ represents any character (U+0020 to U+007E) not in a character
+\verb+`^ ...`+ represents any character (U+20 to U+7E) not in a character set;
\TT{"..."} represents a string to be matched verbatim;
-{\MathQL} Expressions can contain quoted constant strings with the syntax of
-\figref {StrTS}.%
-\footnote{Note that the first slash of the \GP{path} is not optional as
-in {\MathQL}.3.}
\begin{footnotesize} \begin{verbatim}
<dec> ::= '0 - 9'
<num> ::= <dec> [ <dec> ]*
\caption{Textual syntax of numbers, strings and paths} \label{StrTS}
-The meaning of the escaped sequences is shown in \figref{EscTS}
-(where $ .... $ is a 4-digit placeholder).
\begin{center} \begin{tabular}{|l|l|c|}
\hline {\bf Escape sequence} & {\bf Unicode character} & {\bf Text} \\
\caption{Textual syntax of escaped characters} \label{EscTS}
+Queries and results can contain quoted constant strings with the syntax of
+\figref {StrTS}%
+\footnote{Note that the first slash of the \GP{path} is not optional as in {\MathQL}.3.}
+and the meaning of the escaped sequences is shown in \figref{EscTS} (where
+$ .... $ is a 4-digit placeholder).
{\MathQL} character escaping syntax aims at complying with W3C character model
for the World Wide Web \cite{W3Ca} which recommends a support for standard
Unicode characters (U+0000 to U+FFFF) and escape sequences with start/end
In particular {\MathQL} escape delimiters (backslash and caret) are chosen
-among the {\em unwise} characters for URI references (see \cite{URI}) because
-URI references are the natural content of constant strings and these
-characters should not be so frequent in them.
-Query expressions can contain variables for {\av}'s (production \GP{avar})
-and variables for {\av} sets, {\ie} for query results (production \GP{svar})
-according to the syntax of \figref{VarTS}.%
-\footnote{This syntax resembles the one of programming languages identifiers.}
+among the \emph{unwise} characters for {\URI} references (see \cite{URI})
+because {\URI} references are the natural content of constant strings and
+these characters should not be so frequent in them.
\begin{footnotesize} \begin{verbatim}
<alpha> ::= [ 'A - Z' | 'a - z' | `_` ]+
<id> ::= <alpha> [ <alpha> | <dec> ]*
-<avar> ::= "@" <id>
<svar> ::= "$" <id>
+<evar> ::= "@" <id>
\end{verbatim}\end{footnotesize} %$
\caption{Textual syntax of variables} \label{VarTS}
-The syntax of query expressions (production \GP{query}) is described in
+Queries can also contain \emph{set} variables (production \GP{svar}) and
+\emph{element} variables (production \GP{evar}) according to the syntax of
+\figref{VarTS}.%
+\footnote{This syntax resembles the one of programming languages identifiers.}
+A set variable holds an {\av} set, {\ie} a query result, while an element
+variable holds an {\av}.
\begin{footnotesize} \begin{verbatim}
-<qualifier> ::= [ "inverse" ]? [ "sub" | "super" ]? <path>
+<ref> ::= [ "sub" | "super" ]?
+<qualifier> ::= [ "inverse" ]? <ref> <path>
<main> ::= [ "main" <path> ]?
<cons> ::= <path> [ "in" | "match" ] <query>
<istrue> ::= [ "istrue" <cons> [ "," <cons> ]* ]?
<sec> ::= [ "attr" <exp> [ "," <exp> ]* ]?
<opt_args> ::= <main> <istrue> <isfalses> <sec>
<source> ::= [ "pattern" ]? <query>
-<paths> ::= [ <path> [ "," <path> ]* ]?
+<paths> ::= <path> [ "," <path> ]*
<query> ::= "(" <query> ")" | <string> | "[" <xavs> "]"
| "property" <qualifier> <opt_args> "of" <source>
| "let" <svar> "=" <query> "in" <query>
- | <query> ";;" <query> | <svar> | <avar>
- | "ex" <query> | <avar> "." <path>
- | "add" [ "distr" ]? [ <xgroups> | <avar> ] "in" <query>
- | "for" <avar> "in" <query> [ "sup" | "inf" ] <query>
+ | <query> ";;" <query> | <svar> | <evar>
+ | "ex" <query> | <evar> "." <path>
+ | "add" [ "distr" ]? [ <xgroups> | <evar> ] "in" <query>
+ | "for" <evar> "in" <query> [ "sup" | "inf" ] <query>
| "while" <query> [ "sup" | "inf" ] <query>
- | <path> "{" <paths> "}" "{" <queries> "}"
+ | <path> "{" [ <paths> ]? "}" "{" <queries> "}"
| "gen" <path> [ "{" <queries> "}" | "in" <query> ]
<queries> ::= [ <query> [ "," <query> ]* ]?
<xattr> ::= <path> "=" <query>
\caption{Textual syntax of queries} \label{QueryTS}
-The syntax of result expressions (production \GP{avs}) is described in
\begin{footnotesize} \begin{verbatim}
<attr> ::= <path> "=" <string> | "{" <string> [ "," <string> ]* "}"
<group> ::= "{" <attr> [ ";" <attr> ]* "}"
\caption{Textual syntax of results} \label{ResultTS}
-The textual syntax of the language extension provided by the basic library
-is in \figref{BasicTS}.
\begin{footnotesize} \begin{verbatim}
<query> ::= "empty" | "false" | "true"
- | "not" <query>
- | <query> [ "and" | "or" | "xor"| "sub" | "meet" | "eq" | "le"| "lt" ] <query>
- | <query> [ "union" | "intersect" ] <query> | { <queries> }
+ | [ "not" | "count" | "proj" <path> "of" ] <query>
+ | <query> [ "and" | "or" | "xor" ] <query>
+ | <query> [ "sub" | "meet" | "eq" | "le" | "lt" ] <query>
+ | <query> [ "union" | "intersect" | "diff" ] <query>
+ | "{" <queries> "}"
+ | "keep" [ "allbut" ]? [ <paths> "in" ]? <query>
| "if" <query> "then" <query> "else" <query>
- | "select" <avar> "from" <query> "where" <query>
+ | "select" <evar> "from" <query> "where" <query>
\end{verbatim} \end{footnotesize}
-\caption{Textual syntax of basic extension} \label{BasicTS}
+\caption{Textual syntax of the basic extension} \label{BasicTS}
+The core infrastructure of {\MathQL}.4 defines a syntax for queries
+(\figref{QueryTS}, production \GP{query}) and a syntax for results
+(\figref{ResultTS}, production \GP{avs}).
+A syntax extension for the most common functions of the basic library is
+also provided for the user's convenience and for backward compatibility with
+{\MathQL}.3. The syntax extension concerning the functions covered in this
+paper is shown in \figref{BasicTS}.
+Note that this extension makes \GP{avs} an instance of \GP{xavs}.
+\newcommand\EM[1]{\noindent\hbox{\frenchspacing\em #1}}
+\newcommand\TT[1]{\noindent\hbox{\frenchspacing\tt #1}}
+\newcommand\RM[1]{\noindent\hbox{\frenchspacing\rm #1}}
+\newcommand\ie{{\frenchspacing i.e.}}
+\newcommand\oft{\mathrel :}
+\newcommand\app{\mathbin @}
+\newcommand\st{\mathrel |}
+\newcommand\gdlap[2]{\vbox to 0pt {\vskip#2\hbox{#1}\vss}}
+ $\vcenter{\halign{\hss$##$\hss&##\hss\cr#1\crcr}}$}}
+\newcommand\irule[3]{\imain{#1 \iname{#2} #3}}
+\newcommand\Nop{\noindent\hbox to 0pt{\vbox to 1ex{\vfil}\hfil}}
-\def\av{{\frenchspacing a.v.}}
+\newcommand\av{{\frenchspacing a.v.}}
-\def\g{(\G_s, \G_a, \G_g)}
-\def\set#1#2#3{#1[#2 \gets #3]}
+\newcommand\g{(\G_s, \G_e, \G_g)}
+\newcommand\set[3]{#1[#2 \gets #3]}
Here we use a simple type system that includes basic types such as strings and
Booleans, and some type constructors such as product and exponentiation.
$ y \oft Y $ will denote a typing judgement.
-Note that this semantics is not meant as a formal system \emph{per se}, but
-should serve as a reference for implementors.
+This semantics is not meant as a formal system \emph{per se}, but should be a
+reference for implementors.
\subsection {Mathematical background}
-As a mathematical background for the semantics, we take the one presented in
+As background for the semantics, we revise the one presented in
{\Str} denotes the type of strings and its elements are the finite sequences
of Unicode \cite{Unicode} characters.
-Grammatical productions, represented as strings in angle brackets, denote the
-subtype of {\Str} containing the produced sequences of characters.
+Grammatical productions denote the subtype of {\Str} containing the produced
+sequences of characters.
-{\Num} denotes the type of numbers and is the subtype of {\Str} defined by the
-regular expression: \TT{'0 - 9' [ '0 - 9' ]*}.
-In this type, numbers are represented by their decimal expansion.
+{\Num} denotes the type of numbers and is the subtype of {\Str} defined by
+\GP{num}. In this type, numbers are represented by their decimal expansion.
$ \Setof\ Y $ denotes the type of finite sets ({\ie} unordered finite
-sequences without repetitions) over $ Y $.
+sequences without repetitions) over $ Y $ and
$ \Listof\ Y $ denotes the type of lists ({\ie} ordered finite sequences)
over $ Y $.
We will use the notation $ [y_1, \cdots, y_m] $ for the list whose elements
-are $ y_1, \cdots, y_m $.
+are $ y_1, \cdots, y_m $ ($ \{y_1, \cdots, y_m\} $ will denote a set as
-{\Boole}, the type of Boolean values, is defined as
-$ \{\ES, \{("", \ES)\}\} \oft \Setof\ \Setof\ (\Str \times \Setof\ Y) $
+{\Boole}, the type of Boolean values, is defined as the set
+$ \{\ES, \{("", \ES)\}\} $ of type $ \Setof\ \Setof\ (\Str \times \Setof\ Y) $
where the first element stands for \emph{false} (denoted by {\F}) and the
-second element stands for \emph{true} (denoted by {\T}).
+second element stands for \emph{true} (denoted by {\T}).%
+\footnote{This definition allows a Boolean value to be treated as an {\av} set.}
$ Y \times Z $ denotes the product of the types $ Y $ and $ Z $ whose elements
are the ordered pairs $ (y, z) $ such that $ y \oft Y $ and $ z \oft Z $.
The notation is also extended to a ternary product.
$ Y \to Z $ denotes the type of functions from $ Y $ to $ Z $ and $ f\ y $
denotes the application of $ f \oft Y \to Z $ to $ y \oft Y $.
Relations over types, such as equality, are seen as functions to {\Boole}.
$ {\meet} \oft (\Setof\ Y) \to (\Setof\ Y) \to \Boole $ (infix)
$ {\sand} \oft (\Setof\ Y) \to (\Setof\ Y) \to (\Setof\ Y) $ (infix)
$ U \sand W $ is inhabited as a primitive notion, {\ie} without mentioning
intersection and equality as for $ U \sand W \neq \ES $, which is equivalent
but may be implemented less efficiently in real cases%
-\footnote{As for the Boolean condition $ \a \lor \b $ which may have a more
-efficient implementation than $ \lnot(\lnot \a \land \lnot \b) $.}.
+\footnote{As for the Boolean condition $ \fa \lor \fb $ which may have a more
+efficient implementation than $ \lnot(\lnot \fa \land \lnot \fb) $.}.
$ U \meet W $ is a natural companion of $ U \sub W $ being its logical dual
(recall that $ U \sub W $ means $ (\lall u \in U)\ u \in W $)
and is already being used successfully in the context of a constructive
({\ie} intuitionistic and predicative) approach to point-free topology
-Sets of couples play a central role in our formalization and in particular we
-will use:
+Sets of couples play a central role in our presentation and we will use:
Moreover $ \set{W}{y}{z} $ is the set obtained from $ W $ removing every
couple whose first component is $ y $ and adding the couple $ (y, z) $.
-The type of this operator is \\
+The type of this operator is
$ (\Setof\ (Y \times Z)) \to Y \to Z \to (\Setof\ (Y \times Z)) $.
--- /dev/null
+\subsection {The basic library}
+In this section we present the functions provided by the {\MathQL}.4 basic
+library. Describing the whole library would require an amount of space that
+goes beyond the limits of this paper so we include here just a selection of
+the available functions (the ones for which we gave the syntax extension in
+\figref{BasicTS}).
+The functions below are grouped by their arity.
+\textbf{The predefined {\av} sets.}
+The functions \TT{/"empty"}, \TT{/"false"} and \newline \TT{/"true"} take no
+path arguments and no set arguments.
+\irule{\Nop}{}{\Fun\ \TT{/"empty"}\ [\,]\ [\,]\ \G \equiv \F} \spc
+\irule{\Nop}{}{\Fun\ \TT{/"false"}\ [\,]\ [\,]\ \G \equiv \F}
+\irule{\Nop}{}{\Fun\ \TT{/"true"}\ [\,]\ [\,]\ \G \equiv \T}
+Moreover ``\TT{empty}'' rewrites to ``\verb+fun /"empty" {} {}+'',
+``\TT{false}'' rewrites to ``\verb+fun /"false" {} {}+'' and
+``\TT{true}'' rewrites to ``\verb+fun /"true" {} {}+''.
+\textbf{Boolean negation and size.}
+The functions \TT{/"not"} and \TT{/"count"} take no path arguments and one set
+argument.
+Here rule 1 overrides rule 2.
+\irule{(\G, x) \daq \F}{1}{\Fun\ \TT{/"not"}\ [\,]\ [x]\ \G \equiv \T} \spc
+\irule{(\G, x) \daq S}{2}{\Fun\ \TT{/"not"}\ [\,]\ [x]\ \G \equiv \F}
+\irule{(\G, x) \daq S}{}{\Fun\ \TT{/"count"}\ [\,]\ [x]\ \G \equiv \#\ S}
+Moreover ``\TT{not} x'' rewrites to ``\verb+fun /"not" {} {+x\verb+}+''
+and ``\TT{count} x'' rewrites to ``\verb+fun /"count" {} {+x\verb+}+''.
+\textbf{Boolean xor, set-theoretic and numerical tests, difference.}
+\TT{/"xor"}, \TT{/"sub"}, \TT{/"meet"}, \TT{/"eq"}, \TT{/"le"}, \TT{/"lt"} and
+\TT{/"diff"} take no path arguments and two set arguments.
+The rule with the lowest number is applied first.
+\irule{(\G, x_1) \daq \F \spc (\G, x_2) \daq \F}{1}
+ {\Fun\ \TT{/"xor"}\ [\,]\ [x_1, x_2]\ \G \equiv \F} \spc
+\irule{(\G, x_1) \daq S_1 \spc (\G, x_2) \daq \F}{2}
+ {\Fun\ \TT{/"xor"}\ [\,]\ [x_1, x_2]\ \G \equiv S_1}
+\irule{(\G, x_1) \daq \F \spc (\G, x_2) \daq S_2}{3}
+ {\Fun\ \TT{/"xor"}\ [\,]\ [x_1, x_2]\ \G \equiv S_2} \spc
+\irule{(\G, x_1) \daq S_1 \spc (\G, x_2) \daq S_2}{4}
+ {\Fun\ \TT{/"xor"}\ [\,]\ [x_1, x_2]\ \G \equiv \F}
+\irule{(\G, x_1) \daq S_1 \spc (\G, x_2) \daq S_2}{}
+ {\Fun\ \TT{/"sub"}\ [\,]\ [x_1, x_2]\ \G \equiv (\Fsts\ S_1 \sub \Fsts\ S_2)}
+\irule{(\G, x_1) \daq S_1 \spc (\G, x_2) \daq S_2}{}
+ {\Fun\ \TT{/"meet"}\ [\,]\ [x_1, x_2]\ \G \equiv (\Fsts\ S_1 \meet \Fsts\ S_2)}
+\irule{(\G, x_1) \daq S_1 \spc (\G, x_2) \daq S_2}{}
+ {\Fun\ \TT{/"eq"}\ [\,]\ [x_1, x_2]\ \G \equiv (\Fsts\ S_1 = \Fsts\ S_2)}
+\irule{(\G, x_1) \daq \{(r_1, A_1)\} \spc (\G, x_2) \daq \{(r_2, A_2)\}}{}
+ {\Fun\ \TT{/"le"}\ [\,]\ [x_1, x_2]\ \G \equiv (r_1 \le r_2)}
+\irule{(\G, x_1) \daq \{(r_1, A_1)\} \spc (\G, x_2) \daq \{(r_2, A_2)\}}{}
+ {\Fun\ \TT{/"lt"}\ [\,]\ [x_1, x_2]\ \G \equiv (r_1 < r_2)}
+\irule{(\G, x_1) \daq S_1 \spc (\G, x_2) \daq S_2}{}
+ {\Fun\ \TT{/"diff"}\ [\,]\ [x_1, x_2]\ \G \equiv (S_1 \sdiff S_2)}
+where $\sdiff$ is a helper function with two rewrite rules:
+\begin{center} \begin{tabular}{lrll}
+5 &
+$ (S_1 \sdor \{(r, A_1)\}) \sdiff (S_2 \sdor \{(r, A_2)\}) $ & rewrites to &
+$ S_1 \sdiff S_2 $ \\
+6 & $ S_1 \sdiff S_2 $ & rewrites to & $ S_1 $
+\end{tabular} \end{center}
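+As a small worked instance (the heads $ a \neq b $ and the sets of attribute
+groups $ A $, $ B $, $ B\p $ are placeholders chosen only for illustration),
+rule 5 applies once with $ r = b $ and rule 6 then closes the computation:
+\begin{center} \begin{tabular}{rll}
+$ \{(a, A), (b, B)\} \sdiff \{(b, B\p)\} $ & rewrites to &
+$ \{(a, A)\} \sdiff \ES $ \\
+$ \{(a, A)\} \sdiff \ES $ & rewrites to & $ \{(a, A)\} $
+\end{tabular} \end{center}
+In this reading an {\av} is discarded as soon as its head occurs in the second
+argument, whatever attributes it carries.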
+``x \TT{xor} y'' rewrites to ``\verb+fun /"xor" {} {+x\TT{,}y\verb+}+'',
+``x \TT{sub} y'' rewrites to \newline ``\verb+fun /"sub" {} {+x\TT{,}y\verb+}+'',
+``x \TT{meet} y'' rewrites to ``\verb+fun /"meet" {} {+x\TT{,}y\verb+}+'',
+``x \TT{eq} y'' rewrites to ``\verb+fun /"eq" {} {+x\TT{,}y\verb+}+'',
+``x \TT{le} y'' rewrites to \newline ``\verb+fun /"le" {} {+x\TT{,}y\verb+}+'',
+``x \TT{lt} y'' rewrites to ``\verb+fun /"lt" {} {+x\TT{,}y\verb+}+'' and
+``x \TT{diff} y'' rewrites to ``\verb+fun /"diff" {} {+x\TT{,}y\verb+}+''.
+\textbf{Conditional operator and standard \emph{select} clause.}
+\TT{/"if"} takes no path arguments and three set arguments.
+The usual rule overriding policy applies.
+\irule{(\G, x_1) \daq \F \spc (\G, x_3) \daq S_3}{1}
+ {\Fun\ \TT{/"if"}\ [\,]\ [x_1, x_2, x_3]\ \G \equiv S_3} \spc
+\irule{(\G, x_1) \daq S_1 \spc (\G, x_2) \daq S_2}{2}
+ {\Fun\ \TT{/"if"}\ [\,]\ [x_1, x_2, x_3]\ \G \equiv S_2}
+``\TT{if} x \TT{then} y \TT{else} z'' rewrites to
+``\verb+fun /"if" {} {+x\TT{,}y\TT{,}z\verb+}+'' and
+``\TT{select} i \TT{from} x \TT{where} y'' rewrites to
+``\TT{for} i \TT{in} x \TT{sup} \TT{if} y \TT{then} i \TT{else} \TT{empty}''.
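+As an illustration (the constant strings and the variable name are arbitrary
+and only serve as an example), any non-empty first argument selects the
+\TT{then} branch, even when it is not {\T} itself, and a \TT{select} clause is
+unfolded mechanically:
+\begin{footnotesize} \begin{verbatim}
+ if "a" then "b" else "c"      gives "b"
+ if empty then "b" else "c"    gives "c"
+ select @i from {"a", "b"} where @i eq "a"   rewrites to
+ for @i in {"a", "b"} sup if @i eq "a" then @i else empty
+\end{verbatim} \end{footnotesize}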
+\textbf{Intersection without attribute distribution.}
+\TT{/"intersect"} takes no path arguments and any positive number of set
+arguments.
+\irule{(\G, x_1) \daq S_1 \spc \cdots \spc (\G, x_m) \daq S_m}{}
+ {\Fun\ \TT{/"intersect"}\ [\,]\ [x_1, \cdots, x_m]\ \G \equiv
+ (S_1 \sprod \cdots \sprod S_m)}
+As usual
+``x \TT{intersect} y'' rewrites to ``\verb+fun /"intersect" {} {+x\TT{,}y\verb+}+''.
+\textbf{Union without attribute distribution, Boolean conjunction and
+disjunction.}
+\TT{/"union"}, \TT{/"and"} and \TT{/"or"} take no path arguments and any
+number of set arguments.
+The usual rule overriding policy applies.
+\irule{\Nop}{}{\Fun\ \TT{/"union"}\ [\,]\ [\,]\ \G \equiv \F} \spc
+\irule{(\G, x_1) \daq S_1 \spc \cdots \spc (\G, x_m) \daq S_m}{}
+ {\Fun\ \TT{/"union"}\ [\,]\ [x_1, \cdots, x_m]\ \G \equiv
+ (S_1 \ssum \cdots \ssum S_m)}
+\irule{\Nop}{}{\Fun\ \TT{/"and"}\ [\,]\ [\,]\ \G \equiv \T} \spc
+\irule{\Nop}{}{\Fun\ \TT{/"or"}\ [\,]\ [\,]\ \G \equiv \F}
+\irule{\Fun\ \TT{/"and"}\ [\,]\ l\ \G \equiv \F}{1}
+ {\Fun\ \TT{/"and"}\ [\,]\ (l \app [x])\ \G \equiv \F} \spc
+\irule{\Fun\ \TT{/"or"}\ [\,]\ l\ \G \equiv \F \spc (\G, x) \daq S}{3}
+ {\Fun\ \TT{/"or"}\ [\,]\ (l \app [x])\ \G \equiv S}
+\irule{\Fun\ \TT{/"and"}\ [\,]\ l\ \G \equiv S_l \spc (\G, x) \daq S}{2}
+ {\Fun\ \TT{/"and"}\ [\,]\ (l \app [x])\ \G \equiv S} \spc
+\irule{\Fun\ \TT{/"or"}\ [\,]\ l\ \G \equiv S_l}{4}
+ {\Fun\ \TT{/"or"}\ [\,]\ (l \app [x])\ \G \equiv S_l}
+``x \TT{and} y'' rewrites to ``\verb+fun /"and" {} {+x\TT{,}y\verb+}+'',
+``x \TT{or} y'' rewrites to \newline ``\verb+fun /"or" {} {+x\TT{,}y\verb+}+'',
+``x \TT{union} y'' rewrites to \kern-1.1pt ``\verb+fun /"union" {} {+x\TT{,}y\verb+}+''
+and ``\verb+{+x$_1$\TT{,}$\cdots$\TT{,}x$_m$\verb+}+'' rewrites to
+``\verb+fun /"union" {} {+x$_1$\TT{,}$\cdots$\TT{,}x$_m$\verb+}+''.
+\TT{/"proj"} takes one path argument and one set argument.
+\irule {p \oft \TT{<path>} \spc
+ (\G, x) \daq \{ (r_1, A_1), \cdots, (r_m, A_m) \}}{}
+ {\Fun\ \TT{/"proj"}\ [p]\ [x]\ \G \equiv
+ \Head\ (\Proj\ (\Name\ p)\ A_1 \sor \cdots \sor \Proj\ (\Name\ p)\ A_m)}
+\begin{center} \begin{tabular}{rll}
+$ \Proj\ p\ \{G_1, \cdots, G_n\} $ & rewrites to &
+$ \get{G_1}{p} \sor \cdots \sor \get{G_n}{p} $ \\
+$ \Head\ \{s_1, \cdots, s_k\} $ & rewrites to & $ \{ (s_1, \ES), \cdots, (s_k, \ES) \} $
+\end{tabular} \end{center}
+where, for each $ 1 \le j \le n $, $ \get{G_j}{p} $ is $ \ES $ if $ p $ is not
+defined in $ G_j $.
+``\TT{proj} p \TT{of} x'' rewrites to ``\verb+fun /"proj" {+p\verb+} {+x\verb+}+''.
+The functions \TT{/"keep"/"these"} and \TT{/"keep"/"allbut"} take any number
+of path arguments and one set argument.
+In the following rules if $ l $ is $ [p_1, \cdots, p_m] $ then
+$ W $ is $ \{\Name\ p_1, \cdots, \Name\ p_m\} $. Moreover {\Keep} and $\Keep\p$
+are helper functions and the usual rule overriding policy applies.
+\irule{l \oft \Listof\ \TT{<path>} \spc (\G, x) \daq S}{}
+ {\Fun\ \TT{/"keep"/"these"}\ l\ [x]\ \G \equiv
+ \{ (r, \bigsor \{ \Keep\ \T\ W\ G \st G \in A \}) \st (r, A) \in S \}}
+\irule{l \oft \Listof\ \TT{<path>} \spc (\G, x) \daq S}{}
+ {\Fun\ \TT{/"keep"/"allbut"}\ l\ [x]\ \G \equiv
+ \{ (r, \bigsor \{ \Keep\ \F\ W\ G \st G \in A \}) \st (r, A) \in S \}}
+\irule{\Keep\p\ b\ W\ G\ \RM{rewrites to}\ \ES}{1}
+{\Keep\ b\ W\ G\ \RM{rewrites to}\ \ES} \spc
+\irule{\Keep\p\ b\ W\ G\ \RM{rewrites to}\ G\p}{2}
+{\Keep\ b\ W\ G\ \RM{rewrites to}\ \{G\p\}}
+\begin{center} \begin{tabular}{rll}
+$ \Keep\p\ \T\ W\ G $ & rewrites to & $ \{ (p, V) \in G \st p \in W \} $ \\
+$ \Keep\p\ \F\ W\ G $ & rewrites to & $ \{ (p, V) \in G \st p \notin W \} $
+\end{tabular} \end{center}
+``\TT{keep} p$_1$\TT{,}$\cdots$\TT{,}p$_m$ \TT{in} x'' rewrites to
+``\TT{fun /"keep"/"these"} \verb+{+p$_1$\TT{,}$\cdots$\TT{,}p$_m$\verb+}+ \verb+{+x\verb+}+'',
+``\TT{keep} x'' rewrites to ``\TT{fun /"keep"/"these"} \verb+{}+ \verb+{+x\verb+}+'',
+``\TT{keep allbut} p$_1$\TT{,}$\cdots$\TT{,}p$_m$ \TT{in} x'' rewrites to
+``\TT{fun /"keep"/"allbut"} \verb+{+p$_1$\TT{,}$\cdots$\TT{,}p$_m$\verb+}+ \verb+{+x\verb+}+'' and
+``\TT{keep allbut} x'' rewrites to ``\TT{fun /"keep"/"allbut"} \verb+{}+ \verb+{+x\verb+}+''.
+Note that ``\TT{keep allbut} x'' gives the same result as ``x'' does.
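+As a hedged worked instance, assume that $ (\G, x) \daq \{(r, \{G\})\} $ where
+$ G = \{(n_1, V_1), (n_2, V_2)\} $, $ n_1 = \Name\ p_1 $ and $ n_2 \neq n_1 $
+(all these names are placeholders introduced for the illustration).
+Following the rules above we obtain:
+\begin{center} \begin{tabular}{rll}
+$ \Fun\ \TT{/"keep"/"these"}\ [p_1]\ [x]\ \G $ & $ \equiv $ &
+$ \{(r, \{\{(n_1, V_1)\}\})\} $ \\
+$ \Fun\ \TT{/"keep"/"allbut"}\ [p_1]\ [x]\ \G $ & $ \equiv $ &
+$ \{(r, \{\{(n_2, V_2)\}\})\} $ \\
+$ \Fun\ \TT{/"keep"/"these"}\ [\,]\ [x]\ \G $ & $ \equiv $ &
+$ \{(r, \ES)\} $
+\end{tabular} \end{center}
+In the last line $ \Keep\p\ \T\ \ES\ G $ rewrites to $ \ES $, so rule 1 drops
+the whole group and the head $ r $ is left without attributes.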
-\subsection {The core language}
+\subsection {The core language} \label{OCore}
-Wih the above background we are able to type the main objects needed in the
+With the above background we are able to type the main objects needed in the
-A path $ s $ is a list of strings therefore its type is
+A path $ p $ is a list of strings therefore its type is
$ T_{0a} = \Listof\ \Str $.
-A multiple string value $ V $ is an object of type $ T_{0b} = \Setof\ \Str $.
+The contents $ V $ of an attribute form an object of type $ T_{0b} = \Setof\ \Str $.
-A attribute group $ G $ is an association set connecting the attribute names
-to their values, therefore its type is
+An attribute group $ G $ is an association set connecting the attribute names
+to their contents, therefore its type is
$ T_1 = \Setof\ (T_{0a} \times T_{0b}) $.
-A subject string $ r $ is an object of type $ \Str $.
+A head string $ r $ is an object of type $ \Str $.
A set $ A $ of attribute groups is an object of type $ T_2 = \Setof\ T_1 $.
-An {\av} is a subject string with its attribute groups, so its type is
+An {\av}, {\ie} a head string with its attribute groups, has type
$ T_3 = \Str \times T_2 $.
When a constant string appearing in a {\MathQL} expression is unquoted, the
surrounding double quotes are deleted and each escaped sequence is translated
-according \figref{EscTS}.
+according to \figref{EscTS}.
This operation is formally performed by the function
$ \Unquote $ of type $ \Str \to \Str $.
Moreover $ \Name \oft \GP{path} \to T_{0a} $ is a helper function that
relation $ \daq $ that evaluates a query to an {\av} set.
These expressions are evaluated in a context $ \G = \g $
which is a triple of association sets that connect
-svar's to {\av} sets, avar's to {\av}'s and avar's to attribute groups.
+set variables to {\av} sets, element variables to {\av}'s and element
+variables to attribute groups.
Therefore the type $ K $ of the context $ \G $ is:
\begin{footnotesize} \begin{center}
-\Setof\ (\GP{svar} \times T_4) \times
-\Setof\ (\GP{avar} \times T_3)\ \times % $ \\ $ \times\
-\Setof\ (\GP{avar} \times T_1)
+\Setof\ (\GP{svar} \times T_4) \times
+\Setof\ (\GP{evar} \times T_3) \times % $ \\ $ \times\
+\Setof\ (\GP{evar} \times T_1)
\end{center} \end{footnotesize}
\end{tabular} \end{center}
-The context components $ \G_s $ and $ \G_a $ are used to store the contents of
+The context components $ \G_s $ and $ \G_e $ are used to store the contents of
variables, while $ \G_g $ is used by the \TT{ex} operator to be presented
-The first \GP{query} expressions include explicit {\av} sets and syntactic
+The first group of \GP{query} expressions includes the representation of
+explicit {\av} sets and the syntactic grouping facility:
The syntactic grouping is obtained enclosing a \GP{query} between \TT{(}
and \TT{)}.
-An explicit {\av} set can be represented by a single string, which is
+An explicit {\av} set can be represented either by a single string, which is
converted into a single {\av} with no attributes, or by a \GP{xavs}
(extended {\av} set) expression enclosed between \TT{[} and \TT{]}.
Such an expression describes all the components of an {\av} set and is
\irule{x_1, \cdots, x_m \in \GP{xav} \spc
- (\G, TT{[} x_1 \TT{]}) \daq S_1 \spc \cdots \spc (\G, \TT{[} x_m \TT{]}) \daq S_m}{}
- {(\G, \TT{[} x_1 \TT{;} \cdots \TT{;} x_m \TT{]}) \daq S_1 \sum \cdots \sum S_m}
+ (\G, \TT{[} x_1 \TT{]}) \daq S_1 \spc \cdots \spc (\G, \TT{[} x_m \TT{]}) \daq S_m}{}
+ {(\G, \TT{[} x_1 \TT{;} \cdots \TT{;} x_m \TT{]}) \daq S_1 \ssum \cdots \ssum S_m}
-\irule{q \in \GP{string} \spc g_1, \cdots, g_m \in \GP{xgroup} \spc
+\irule{q \in \GP{string} \spc g_1, \cdots, g_m \in \GP{xgroup} \icr
(\G, \TT{[} q\ \TT{attr}\ g_1 \TT{]}) \daq S_1 \spc \cdots \spc
(\G, \TT{[} q\ \TT{attr}\ g_m \TT{]}) \daq S_m}{}
- {(\G, \TT{[} q\ \TT{attr}\ g_1 \TT{,} \cdots \TT{,} g_m \TT{]}) \daq S_1 \sum \cdots \sum S_m}
+ {(\G, \TT{[} q\ \TT{attr}\ g_1 \TT{,} \cdots \TT{,} g_m \TT{]}) \daq S_1 \ssum \cdots \ssum S_m}
-\irule{q \in \GP{string} \spc a_1, \cdots, a_m \in \GP{xatr} \spc
+\irule{q \in \GP{string} \spc a_1, \cdots, a_m \in \GP{xattr} \icr
(\G, \TT{[} q\ \TT{attr}\ \{ a_1 \} \TT{]}) \daq S_1 \spc \cdots \spc
(\G, \TT{[} q\ \TT{attr}\ \{ a_m \} \TT{]}) \daq S_m}{}
{(\G, \TT{[} q\ \TT{attr}\ \{ a_1 \TT{;} \cdots \TT{;} a_m \} \TT{]}) \daq S_1 \dsum \cdots \dsum S_m}
-$ \dsum $ and $ \sum $ are helper functions describing the two union operations
-on {\av} sets: with and without attribute distribution respectively.
-$ \dsum $ and $ \sum $ have two rewrite rules each.
+$ \dsum $ and $ \ssum $ are helper functions describing the two union
+operations on {\av} sets: with and without attribute distribution respectively.
\begin{center} \begin{tabular}{lrll}
1a &
-$ (S_1 \sdor \{(r, A_1)\}) \sum (S_2 \sdor \{(r, A_2)\}) $ & rewrites to &
-$ (S_1 \sum S_2) \sor \{(r, A_1 \sor A_2)\} $ \\
-1b & $ S_1 \sum S_2 $ & rewrites to & $ S_1 \sor S_2 $ \\
+$ (S_1 \sdor \{(r, A_1)\}) \ssum (S_2 \sdor \{(r, A_2)\}) $ & rewrites to &
+$ (S_1 \ssum S_2) \sor \{(r, A_1 \sor A_2)\} $ \\
+1b & $ S_1 \ssum S_2 $ & rewrites to & $ S_1 \sor S_2 $ \\
2a &
$ (S_1 \sdor \{(r, A_1)\}) \dsum (S_2 \sdor \{(r, A_2)\}) $ & rewrites to &
$ (S_1 \dsum S_2) \sor \{(r, A_1 \distr A_2)\} $ \\
\end{tabular} \end{center}
-Rules 1a, 2a override 1b, 2b respectively and
-$ A_1 \distr A_2 = \{G_1 \sum G_2 \st G_1 \in A_1, G_2 \in A_2\} $.
+Rules 1a, 2a override 1b, 2b and
+$ A_1 \distr A_2 = \{G_1 \ssum G_2 \st G_1 \in A_1, G_2 \in A_2\} $.
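+As a small worked instance (with placeholder attribute names $ p_1 \neq p_2 $,
+with the residual sets $ S_1 $, $ S_2 $ of rules 1a and 2a both empty, and
+assuming that the union of two empty sets is empty in both cases), the two
+operations differ as follows:
+\begin{center} \begin{tabular}{rll}
+$ \{(r, \{\{(p_1, V_1)\}\})\} \ssum \{(r, \{\{(p_2, V_2)\}\})\} $ & rewrites to &
+$ \{(r, \{\{(p_1, V_1)\}, \{(p_2, V_2)\}\})\} $ \\
+$ \{(r, \{\{(p_1, V_1)\}\})\} \dsum \{(r, \{\{(p_2, V_2)\}\})\} $ & rewrites to &
+$ \{(r, \{\{(p_1, V_1), (p_2, V_2)\}\})\} $
+\end{tabular} \end{center}
+Without distribution the head $ r $ keeps two separate attribute groups, while
+with distribution the two groups are merged into a single one, because
+$ A_1 \distr A_2 $ contains only $ \{(p_1, V_1)\} \ssum \{(p_2, V_2)\} $, which
+rule 1b rewrites to $ \{(p_1, V_1), (p_2, V_2)\} $.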
-The semantics of \TT{property} operator is described below.
+The semantics of the \TT{property} operator is described below.
In the following rule,
-$s$ is ``$ \TT{property}\ h\ p_1\ \TT{main}\ p_2\ \RM{attr}\ e_1, \cdots,
+$s$ is ``$ \TT{property}\ h\ p_1\ \TT{main}\ p_2\ \TT{attr}\ e_1, \cdots,
e_m\ \TT{in}\ k\ x $'', $P$ is $ \Property\ h $ and
$A_2$ is $ \{ \Exp\ P\ r_1\ p_1\ \{e_1, \cdots, e_m\}\} $:
-{h \oft \GP{refine} \spc p_1, p_2 \oft \GP{path} \spc
+{h \oft \GP{ref} \spc p_1, p_2 \oft \GP{path} \spc
e_1, \cdots, e_m \oft \GP{exp} \spc k \in \TT{["pattern"]?} \spc
(\G, x) \daq S
When the \TT{main} clause is not present, we assume $ p_2 = \TT{/} $.
Here $ \Property\ h $ gives the appropriate access relation according to
-the $ h $ flag (this is the primitive function that inspects the {\RDF} graph,
-see \subsecref{HighAccess}).
+the $ h $ flag (this is the primitive function that inspects the {\RDF}
+graph, see \subsecref{HighAccess}).
$ \Src\ k\ P\ V $ is a helper function giving the source set
according to the $ k $ flag. $ \Src $ is based on $ \Match $, the helper
-function handling POSIX regular expressions. Formally:
+function handling {\POSIX} regular expressions.
+Here $ \Pattern\ W\ s $ is the primitive function returning the subset of
+$ W \oft \Setof\ \Str $ whose elements match the {\POSIX} 1003.2-1992%
+\footnote{In {\POSIX} 1003.1-2001:
+regular expression $ \verb+"^"+ \app s \app \TT{"\$"} $.
\begin{center} \begin{tabular}{rll}
\end{tabular} \end{center}
-Here $ \Pattern\ W\ s $ is the primitive function returning the subset of
-$ W \oft \Setof\ \Str $ whose element match the POSIX 1003.2-1992%
-\footnote{Included in POSIX 1003.1-2001:
-regular expression $ \verb+"^"+ \app s \app \TT{"\$"} $.
-$ \Exp\ P\ \p_1\ r_1\ E $ is the helper function that builds the group of
+$ \Exp\ P\ r_1\ p_1\ E $ is the helper function that builds the group of
attributes specified in the \TT{attr} clause.
-$ \Exp $ is based on $ \Exp\p $ which handles a single attribute. Formally:
+$ \Exp $ is based on $ \Exp\p $ which handles a single attribute. Formally,
+if $ p, p\p \oft \GP{path} $ and $ E \oft \Setof\ \GP{exp} $:
-\begin{center} \begin{tabular}{rlll}
+\begin{center} \begin{tabular}{rll}
$ f\ P\ r_1\ p_1\ p $ & rewrites to &
-$ \{ r_2 \st (r_1, p_1 \app (\Name\ p), r_2) \in P \} $ &
-with $ p \oft \GP{path} $ \\
+$ \{ r_2 \st (r_1, p_1 \app (\Name\ p), r_2) \in P \} $ \\
$ \Exp\p\ P\ r_1\ p_1\ p $ & rewrites to &
-$ \{ (\Name\ p, f\ P\ r_1\ p_1\ p) \} $ &
-with $ p \oft \GP{path} $ \\
+$ \{ (\Name\ p, f\ P\ r_1\ p_1\ p) \} $ \\
$ \Exp\p\ P\ r_1\ p_1\ (p\ \TT{as}\ p\p) $ & rewrites to &
-$ \{ (\Name\ p\p, f\ P\ r_1\ p_1\ p) \} $ &
-with $ p, p\p \oft \GP{path} $ \\
+$ \{ (\Name\ p\p, f\ P\ r_1\ p_1\ p) \} $ \\
$ \Exp\ P\ r_1\ p_1\ E $ & rewrites to &
-$ \bigsum \{ \Exp\p\ P\ r_1\ p_1\ e \st e \in E \} $ &
-with $ E \oft \Setof\ \GP{exp} $
+$ \bigsum \{ \Exp\p\ P\ r_1\ p_1\ e \st e \in E \} $ \\
\end{tabular} \end{center}
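As a worked sketch of these rules (the expressions are chosen only for this
illustration, and we assume that $ \bigsum $ iterates $ \ssum $ over the
given set), let $ E = \{p,\ q\ \TT{as}\ q'\} $ with
$ p, q, q' \oft \GP{path} $ and $ \Name\ p \neq \Name\ q' $:
\begin{center} \begin{tabular}{rll}
$ \Exp\ P\ r_1\ p_1\ E $ & rewrites to &
$ \{ (\Name\ p, f\ P\ r_1\ p_1\ p) \} \ssum
  \{ (\Name\ q', f\ P\ r_1\ p_1\ q) \} $ \\
& rewrites to &
$ \{ (\Name\ p, f\ P\ r_1\ p_1\ p), (\Name\ q', f\ P\ r_1\ p_1\ q) \} $ \\
\end{tabular} \end{center}
Under the same assumption, if $ \Name\ p = \Name\ q' $ the two attributes
are merged into a single one whose contents are
$ (f\ P\ r_1\ p_1\ p) \sor (f\ P\ r_1\ p_1\ q) $.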
For each clause ``\TT{isfalse} $ c_1, \cdots, c_n $'' the set $ P $
-must be replaced with
+must be replaced with \newline
$ \{ (r_1, p, r_2) \in P \st \lnot (\Istrue\ P\ r_1\ p_1\ C) \} $
(using the above notation).
Note that these substitutions and the former must be composed if necessary.
-The second group of \GP{query} expressions includes the context manipulation
+The second group of \GP{query} expressions supports context manipulation:
\begin{footnotesize} \begin{center}
\irule{i \oft \GP{svar}}{}{(\g, i) \daq \get{\G_s}{i}} \spc
-\irule{i \oft \GP{avar}}{}{(\g, i) \daq \{\get{\G_a}{i}\}}
+\irule{i \oft \GP{evar}}{}{(\g, i) \daq \{\get{\G_e}{i}\}}
\end{center} \end{footnotesize}
-$ \get{\G_s}{i} $ and $ \{\get{\G_a}{i}\} $ mean $ \ES $ if $ i $ is not defined.
+$ \get{\G_s}{i} $ and $ \{\get{\G_e}{i}\} $ mean $ \ES $ if $ i $ is not defined.
-The \TT{let} operator assigns an {\av} set variable (svar):
+The \TT{let} operator assigns a set variable (\GP{svar}):
\irule{i \oft \GP{svar} \spc (\G_1, x_1) \daq (\g, S_1) \spc
- ((\set{\G_s}{i}{S_1}, \G_a, \G_g), x_2) \daq (\G_2, S_2)}
+ ((\set{\G_s}{i}{S_1}, \G_e, \G_g), x_2) \daq (\G_2, S_2)}
{}{(\G_1, \TT{let}\ i\ \TT{=}\ x_1\ \TT{in}\ x_2) \daq (\G_2, S_2)}
The sequential composition operator \TT{;;} has the semantics of a \TT{let}
-introducing a fresh variable, so ``$ x_1\ \TT{;;}\ x_2 $'' revrites
+introducing a fresh variable, so ``$ x_1\ \TT{;;}\ x_2 $'' rewrites
to ``$ \TT{let}\ i\ \TT{=}\ x_1\ \TT{in}\ x_2 $'' where $i$ does not occur in
The \TT{ex} and ``dot'' operators provide a way to read the attributes stored
-in avar's.
+in element variables.
The \TT{ex} (exists) operator gives access to the groups of attributes
-associated to the {\av}'s in the $ \G_a $ part of the context and does
+associated to the {\av}'s in the $ \G_e $ part of the context and does
this by loading its $ \G_g $ part, which is used by the ``dot'' operator
described below.
\TT{ex} is true if the query following it is successful for at least one
-pool of attribute groups, one for each {\av} in the $ \G_a $ part of the
+pool of attribute groups, one for each {\av} in the $ \G_e $ part of the
context. Formally we have the rules:
-\irule{(\lall \D_g \in \All\ \G_a)\ ((\G_s, \G_a, \G_g + \D_g), y) \daq \F}
+\irule{(\lall \D_g \in \All\ \G_e)\ ((\G_s, \G_e, \G_g + \D_g), y) \daq \F}
{1}{(\G, \TT{ex}\ y) \daq \F} \spc
\irule{\Nop}{2}{(\G, \TT{ex}\ y) \daq \T} \spc
-\irule {i \oft \GP{avar} \spc p \oft \GP{path} \spc \get{\get{\G_g}{i}}{\Name\ p} = \{s_1, \cdots, s_m\}}{}
+\irule {i \oft \GP{evar} \spc p \oft \GP{path} \spc \get{\get{\G_g}{i}}{\Name\ p} = \{s_1, \cdots, s_m\}}{}
{(\G, i\TT{.}p) \daq \{(s_1, \ES), \cdots, (s_m, \ES)\}}
\footnote{$\D_g$ has the type of $ \G_g $.}
-$ \All\ \G_a = \{\D_g \st \get{\D_g}{i} = G\ \RM{iff}\ G \in \Snd\ \get{\G_a}{i} \} $,
+$ \All\ \G_e = \{\D_g \st \get{\D_g}{i} = G\ \RM{iff}\ G \in \Snd\ \get{\G_e}{i} \} $,
and $ \G = \g $.
Moreover $ \get{\get{\G_g}{i}}{\Name\ p} $ means $ \ES $
if $ i $ or $ \Name\ p $ are not defined where appropriate.
Here the first rule has higher precedence than the second one does.
-The third group of \GP{query} expressions includes the {\av} set manipulation
+The third group of \GP{query} expressions supports {\av} set manipulation:
The \TT{add} operator adds a given set of attribute groups to the {\av}'s
of an {\av} set using a union with or without attribute distribution
-according to the \TT{distr} flag.
+according to the setting of the \TT{distr} flag.
-{h \in \TT{["distr"]?} \spc a \in \GP{xgroups} \spc
+{h \in \TT{["distr"]?} \spc a \in \GP{xgroups} \icr
(\G, \TT{[} ""\ \TT{attr}\ a \TT{]}) \daq \{("", A)\} \spc
(\G, x) \daq \{(r_1, A_1), \cdots, (r_m, A_m)\}}{}
{(\G, \TT{add}\ a\ \TT{in}\ x) \daq \{(r_1, A_1 \jolly A), \cdots, (r_m, A_m \jolly A)\}}
-{h \in \TT{["distr"]?} \spc i \in \GP{avar} \spc
+{h \in \TT{["distr"]?} \spc i \in \GP{evar} \spc
(\g, x) \daq \{(r_1, A_1), \cdots, (r_m, A_m)\}}{}
-{(\g, \TT{add}\ i\ \TT{in}\ x) \daq \{(r_1, A_1 \jolly \Snd\ \get{\G_a}{i}), \cdots, (r_m, A_m \jolly \Snd\ \get{\G_a}{i})\}}
+{(\g, \TT{add}\ i\ \TT{in}\ x) \daq \{(r_1, A_1 \jolly \Snd\ \get{\G_e}{i}), \cdots, (r_m, A_m \jolly \Snd\ \get{\G_e}{i})\}}
Where $ \jolly_{\tt""} = \sor $ and $ \jolly_{\tt"distr"} = \distr $.
-Moreover $ \Snd\ \get{\G_a}{i} = \ES $ if $i$ is not defined.
+Moreover $ \Snd\ \get{\G_e}{i} = \ES $ if $i$ is not defined.
-The semantics of the \TT{for} operator is given in terms of the {\For} helper
+The semantics of the \TT{for} operator is given using the {\For} helper
-\irule{i \oft \GP{avar} \spc (\G, x_1) \daq (\G_1, S_1) \spc h \in \TT{["sup"|"inf"]}}
-{}{(\G, \TT{for}\ i\ \TT{in}\ x_1\ h\ x_2) \daq \For\ h\ \G_1\ i\ x_2\ S_1} \spc
-\irule{i \oft \GP{avar} \spc x_2 \oft \GP{query}}{}
+\irule{i \oft \GP{evar} \spc (\G, x_1) \daq (\G_1, S_1) \spc h \in \TT{["sup"|"inf"]}}
+{}{(\G, \TT{for}\ i\ \TT{in}\ x_1\ h\ x_2) \daq \For\ h\ \G_1\ i\ x_2\ S_1}
+\irule{i \oft \GP{evar} \spc x_2 \oft \GP{query}}{}
{\For\ h\ \G\ i\ x_2\ \ES\ \RM{rewrites to}\ (\G, \ES)}
-\irule{i \oft \GP{avar} \spc ((\G_s, \set{\G_a}{i}{R}, \G_g), x_2) \daq (\G_2, S_2)}
+\irule{i \oft \GP{evar} \spc ((\G_s, \set{\G_e}{i}{R}, \G_g), x_2) \daq (\G_2, S_2)}
{}{\For\ h\ \G\ i\ x_2\ (S_1 \sdor \{R\})\ \RM{rewrites to}\
(\G_2 ,(\Snd\ (\For\ h\ \G_2\ i\ x_2\ S_1)) \jolly_h S_2)}
-Here we have $ R \oft T_3 $, $ \G = \g $, $ \jolly_{\tt"sup"} = \sum $ and
-$ \jolly_{\tt"inf"} = \prod $.
+Here we have $ R \oft T_3 $, $ \G = \g $, $ \jolly_{\tt"sup"} = \ssum $ and
+$ \jolly_{\tt"inf"} = \sprod $.
-$ \dprod $ and $ \prod $ are helper functions describing the two intersection
+$ \dprod $ and $ \sprod $ are helper functions describing the two intersection
operations on {\av} sets: with and without attribute distribution respectively.
-They are dual to $ \dsum $ and $ \sum $. $ \dprod $ does not appear in this
-version of {\MathQL} but was used in the erlier versions
-\cite{Lor02, GS03, Gui03}.
+They are dual to $ \dsum $ and $ \ssum $. $ \dprod $ does not appear in this
+version of {\MathQL} but was used in the earlier versions
\begin{center} \begin{tabular}{lrll}
1a &
-$ (S_1 \sdor \{(r, A_1)\}) \prod (S_2 \sdor \{(r, A_2)\}) $ & rewrites to &
-$ (S_1 \prod S_2) \sor \{(r, A_1 \sor A_2)\} $ \\
-1b & $ S_1 \prod S_2 $ & rewrites to & $ \ES $ \\
+$ (S_1 \sdor \{(r, A_1)\}) \sprod (S_2 \sdor \{(r, A_2)\}) $ & rewrites to &
+$ (S_1 \sprod S_2) \sor \{(r, A_1 \sor A_2)\} $ \\
+1b & $ S_1 \sprod S_2 $ & rewrites to & $ \ES $ \\
2a &
$ (S_1 \sdor \{(r, A_1)\}) \dprod (S_2 \sdor \{(r, A_2)\}) $ & rewrites to &
$ (S_1 \dprod S_2) \sor \{(r, A_1 \distr A_2)\} $ \\
\end{tabular} \end{center}
-As for $ \sum $ and $ \dsum $, rules 1a, 2a override rules 1b, 2b respectively.
+As for $ \ssum $ and $ \dsum $, rules 1a, 2a override rules 1b, 2b respectively.
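As a worked sketch of rules 1a and 1b for the intersection (the {\av} sets
below are chosen only for this illustration), let
$ S_1 = \{(r, \{G_1\})\} $ and $ S_2 = \{(r, \{G_2\}), (r', \{G_3\})\} $
with $ r \neq r' $:
\begin{center} \begin{tabular}{rll}
$ S_1 \sprod S_2 $ & rewrites to &
$ (\ES \sprod \{(r', \{G_3\})\}) \sor \{(r, \{G_1\} \sor \{G_2\})\} $ \\
& rewrites to & $ \{(r, \{G_1, G_2\})\} $ \\
\end{tabular} \end{center}
Only the common value $ r $ survives, and it keeps the groups coming from
both operands; the value $ r' $ is dropped by rule 1b.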
The semantics of the \TT{while} operator is given by the rules below:
{h \in \TT{["sup"|"inf"]} \spc (\G, x_1) \daq (\G_1, S_1) \spc
- (\G_1, x_2) \daq (\G_2, S_2) \spc
+ (\G_1, x_2) \daq (\G_2, S_2) \icr
(\G_2, \TT{while}\ x_1\ h\ x_2) \daq (\G_3, S)}{2}
{(\G, \TT{while}\ x_1\ h\ x_2) \daq (\G_3, S_2 \jolly_h S)}
-Again $ \jolly_{\tt"sup"} = \sum $ and $ \jolly_{\tt"inf"} = \prod $.
-Moreover rule 1 takes precedence over rule 2.
+Again $ \jolly_{\tt"sup"} = \ssum $ and $ \jolly_{\tt"inf"} = \sprod $.
+Rule 1 takes precedence over rule 2.
-The forth group of \GP{query} constructions make {\MathQL} an extensible
+The fourth group of \GP{query} constructions makes {\MathQL} extensible.
to invoke an undefined function.
-The \TT{gen} construction invokes an external function returning a \GP{query}
+The \TT{gen} construction invokes an external function returning a \GP{query}.
The function is identified by a \GP{path} and its arguments are a set of
\GP{query}'s. It is a mistake to invoke a function with the wrong number of
\GP{query}'s as input (each particular function defines this number
\GP{query} $ is the primitive function performing the low level invocation.
The core language does not include any external function of this kind and it
is a mistake to invoke an undefined function.
-The construction ``\TT{gen} p \TT{in} x'' rewrites to ``\TT{gen} p \{x\}''
-for the user's convenience.
+The construction ``\TT{gen} p \TT{in} x'' rewrites to ``\TT{gen} p \{x\}''.
-An \GP{avs} expression ({\ie} the explicit representation of an {\av} set that
-can denote a query result) is evaluated to an {\av} set according to the
-following rules.
+An \GP{avs} expression (the explicit representation of an {\av} set denoting a
+query result) is evaluated to an {\av} set according to the following rules.
\irule{x_1, \cdots, x_m \in \GP{av} \spc
x_1 \dar S_1 \spc \cdots \spc x_m \dar S_m}{}
- {x_1 \TT{;} \cdots \TT{;} x_m \dar S_1 \sum \cdots \sum S_m}
+ {x_1 \TT{;} \cdots \TT{;} x_m \dar S_1 \ssum \cdots \ssum S_m}
\irule{q \in \GP{string} \spc g_1, \cdots, g_m \in \GP{group} \spc
q\ \TT{attr}\ g_1 \dar S_1 \spc \cdots \spc
q\ \TT{attr}\ g_m \dar S_m}{}
- {q\ \TT{attr}\ g_1 \TT{,} \cdots \TT{,} g_m \dar S_1 \sum \cdots \sum S_m}
+ {q\ \TT{attr}\ g_1 \TT{,} \cdots \TT{,} g_m \dar S_1 \ssum \cdots \ssum S_m}
+\irule{q, q_0 \in \GP{string} \spc p \in \GP{path}}{}
+ {q\ \TT{attr}\ \{ p = q_0 \} \dar
+ \{(\Unquote\ q, \{ \{ (\Name\ p, \{ \Unquote\ q_0 \}) \} \})\}}
\irule{q, q_1, \cdots, q_m \in \GP{string} \spc p \in \GP{path}}{}
{q\ \TT{attr}\ \{ p = \{ q_1 \TT{,} \cdots \TT{,} q_m \} \} \dar
\{(\Unquote\ q, \{ \{ (\Name\ p, \{ \Unquote\ q_1, \cdots, \Unquote\ q_m \}) \} \})\}}
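As a worked sketch of these rules (the strings and the path are chosen only
for this illustration), consider the \GP{avs} expression
$ \TT{"A"}\ \TT{attr}\ \{ \TT{/"n"} = \TT{"1"} \} \TT{,}
  \{ \TT{/"n"} = \TT{"2"} \} $
and write $ a $ for $ \Unquote\ \TT{"A"} $, $ n $ for $ \Name\ \TT{/"n"} $,
$ v_1 $ for $ \Unquote\ \TT{"1"} $ and $ v_2 $ for $ \Unquote\ \TT{"2"} $.
The rule for a single string gives
\begin{center} \begin{tabular}{rll}
$ \TT{"A"}\ \TT{attr}\ \{ \TT{/"n"} = \TT{"1"} \} $ & $ \dar $ &
$ S_1 = \{(a, \{ \{ (n, \{ v_1 \}) \} \})\} $ \\
$ \TT{"A"}\ \TT{attr}\ \{ \TT{/"n"} = \TT{"2"} \} $ & $ \dar $ &
$ S_2 = \{(a, \{ \{ (n, \{ v_2 \}) \} \})\} $ \\
\end{tabular} \end{center}
so the whole expression evaluates to $ S_1 \ssum S_2 $, which by rule 1a of
$ \ssum $ is
$ \{(a, \{ \{ (n, \{ v_1 \}) \}, \{ (n, \{ v_2 \}) \} \})\} $:
a single {\av} with two attribute groups.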
proof-checking systems, and also for learning environments because these
applications require features for classifying, searching and browsing
mathematical information in a semantically meaningful way.
Other languages to be defined in the context of the MathQL proposal may be
suitable for queries about the semantic structure of mathematical data:
this includes content-based pattern-matching and possibly other forms of
We will briefly analyze these features in the remaining part of this
\subsubsection*{The main requirements from the RDF community}
As a query language for {\RDF} databases, {\MathQL} has a well-conceived
the best usability.
The two syntaxes concern both queries and results, making {\MathQL} usable in
a distributed environment where query engines are implemented as stand-alone
-components. This is because in this setting both queries and query results
-must be exchanged by the system's components and thus need to be encoded in
-clearly defined format.
+components. In this setting, in fact, both the queries and their results must be
+exchanged by the system's components and thus need to be clearly encoded.
{\MathQL} provides a graph-oriented access to the {\RDF} metadata, based on
tree instantiation.
{\MathQL} query results are meant to capture the structure of trees coming
from an {\RDF} graph and for this purpose a standard $1$- or $2$-dimensional
organization (as provided by most {\RDF}-oriented query languages) is not
-satisfactory. Here {\MathQL} approach is to use a $4$-dimensional organization
+satisfactory. The {\MathQL} approach is to use a $4$-dimensional organization
for its query results.
\subsubsection*{Post-processing and code generation capabilities}
The {\MathQL} query engine, which is written in {\CAML}%
functions (also available at {\CAML} side) that the expert user can define
writing suitable {\CAML} modules for the engine.
Note that the generated code is always {\MathQL} code.
The code generation features allow complex queries to be built incrementally
and automatically, as required by the {\HELM} project.
Using the native programming language, instead, queries can include the
component issuing some {\MathQL} querying code followed by some {\CAML}
post-processing code is really infeasible in a distributed context.
\subsubsection*{Physical organization of the RDF database}
The implementation of the {\MathQL} query engine does not depend on any
--- /dev/null
+\section{A use case: retrieving the transitive principles}
+In this section we briefly present one of the many queries that we are using
+to test {\MathQL}.4: the one that retrieves the transitive principles stored
+in the {\HELM} library. The details on the {\RDF} metadata used to index the
+contents of the library can be found in \cite{Sch02,Gui03,GSC03}.
+This query, executed in {\MySQL}-mode on an AMD Athlon 1.5 GHz, retrieved
+$55$ {\HELM} objects (out of $41451$) in $4.00s$ (the interpreter worked
+for $0.31s$) after having issued $2205$ {\SQL} queries to the underlying
+database. This test was executed on April 2, 2004 by the Author.
+\begin{footnotesize} \begin{verbatim}
+gen /"helm"/"aliases" in let $sets = property inverse /"refSort" istrue
+/"h:sort" in $SET, /"h:position" in $MC, /"h:depth" in "0" of "" in let
+$prop = property inverse /"refSort" istrue /"h:sort" in $PROP,
+/"h:position" in $MC, /"h:depth" in "2" of "" in let $rels0 = for @uri
+in $prop sup add {/"set" = property /"refObj" main /"h:occurrence" istrue
+/"h:position" in $MH of @uri} in @uri in let $rels = select @uri from
+$rels0 where ex ((count @uri./"set" eq "1") and (@uri./"set" sub $sets))
+in let $trans0 = for @uri in $rels sup add {/"rel" = @uri; /"set" = proj
+/"set" of @uri} in property inverse /"refObj" main /"h:occurrence" istrue
+/"h:position" in $MC, /"h:depth" in "5" of @uri in let $trans1 = for @uri
+in $trans0 sup add distr {/"premises" = property /"refObj" main /"h:occur
+rence" istrue /"h:position" in $MH of @uri; /"extra" = property /"refObj"
+main /"h:occurrence" istrue /"h:position" in {$IC, $IH} of @uri} in @uri
+in let $trans = select @uri from $trans1 where ex (not @uri./"extra" and
+(@uri./"premises" sub {@uri./"rel", @uri./"set"})) in keep $trans
+\end{verbatim} \end{footnotesize} %$