\subsection {Sets of attributed values.} \label{AVSets}

The data representation model used by {\MathQL} relies on the notion of
\emph{set of attributed values} ({\av} set for short) that is, in practice,
the only data type available in {\MathQL}.4. In this sense {\MathQL}.4 is a
statically untyped language.%
\footnote
{A type system that fits {\MathQL} as an {\RDF}-oriented query language
should be derived from the {\RDFS} class system. This may be addressed in
future work.}

Each {\av} in an {\av} set consists of a string%
\footnote{When we say \emph{string}, we mean a finite sequence of characters.}
(that we call the \emph{head string} or \emph{value}) and a (possibly empty)
multiset of named attributes whose content is a set of strings.
Attribute names are made of a (possibly empty) list of string components, so
they can be hierarchically structured.
Moreover the attributes of a value are partitioned into a set of \emph{groups}
({\ie} subsets) to improve its structure.

In the above description a \emph{set} is an \emph{unordered} finite
sequence \emph{without} repetitions whereas a \emph{multiset} is an
\emph{unordered} finite sequence \emph{with} repetitions.

In the present context repetitions are defined as follows:
two {\av}'s are repeated if they share the same head string without any
condition on their attributes, two groups are repeated if they contain the
same attributes (equal both in name and content), two attributes of a group
are repeated if they share the same name without any condition on their
content, and two strings are always compared in a case-sensitive manner.%
\footnote
{The Author's experience with {\MathQL} seems to show that the above
definition of an {\av} set is just the right one among the many alternatives.}
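
The repetition rules above can be made concrete with a minimal Python sketch.
It is only an illustration under stated assumptions, not the {\MathQL}
implementation: the type names and the helpers \TT{make\_group} and
\TT{make\_av\_set} are hypothetical, and rejecting a repeated head string with
an error is a choice made here for simplicity.

```python
from typing import Dict, FrozenSet, Iterable, Mapping, Tuple

# Hypothetical types, introduced only for this sketch.
Name = Tuple[str, ...]                           # hierarchical attribute name
Group = FrozenSet[Tuple[Name, FrozenSet[str]]]   # attributes of one group
AVSet = Dict[str, FrozenSet[Group]]              # head string -> set of groups
RawGroup = Mapping[Name, Iterable[str]]


def make_group(attrs: RawGroup) -> Group:
    """Freeze one group; a mapping cannot hold two attributes with one name."""
    return frozenset((name, frozenset(values)) for name, values in attrs.items())


def make_av_set(avs: Iterable[Tuple[str, Iterable[RawGroup]]]) -> AVSet:
    """Build an av set; a repeated head string is rejected (an assumption)."""
    result: AVSet = {}
    for head, groups in avs:
        if head in result:
            raise ValueError(f"repeated head string: {head!r}")
        # Groups equal in both names and contents collapse into one.
        result[head] = frozenset(make_group(g) for g in groups)
    return result


example = make_av_set([
    ("A", [{("date", "first"): ["2002-01-01"]},
           {("date", "first"): ["2002-01-01"]}]),   # repeated group
    ("a", []),                                       # differs from "A" by case
])
```

In this sketch \TT{example} holds two {\av}'s, because head strings are
compared in a case-sensitive manner, and the {\av} with head string ``A''
keeps a single group, because the two groups given for it are repeated.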

As we said, {\MathQL}.4 uses {\av} sets to represent many kinds of
information, as detailed below.

A pool of {\RDF} triples having a common subject $r$, which in general is a
{\URI} reference \cite{URI}%
\footnote
{A {\URI} \emph {reference} is a {\URI} with an optional fragment identifier.},
is encoded in a single {\av} placing $r$ in the head string.
The predicates of the triples are encoded as attribute names and their objects
are placed in the attributes' contents.
These contents are structured as multiple strings with the aim of holding the
objects of repeated predicates.
Moreover structured attribute names can encode various components of
structured properties preserving their semantics.

\begin{figure}
\begin{footnotesize} \begin{verbatim}
   ("protocol", "dc:creator", "Sandro Hawke")
   ("protocol", "dc:creator", "Eric Prud'hommeaux")
   ("protocol", "dc:date", "2002-01-08")

The corresponding attributed value:
   "protocol" attr {/"dc:creator" = {"Sandro Hawke", "Eric Prud'hommeaux"};
                    /"dc:date" = "2002-01-08"}
\end{verbatim} \end{footnotesize}
\caption{The representation of a pool of {\RDF} triples} \label{AVOne}
\end{figure}

\figref{AVOne} shows how a set of triples can be coded in an {\av}.
Note that the word \TT{attr} separates the head string from its attributes,
braces enclose an attribute group in which attributes are separated by
semicolons, and an equal sign separates an attribute name from its contents.

In this setting the grouping feature can be used to separate semantically
different classes of properties associated with a resource (as for instance
Dublin Core metadata, Euler metadata and user-defined metadata).

A pool of arbitrarily chosen {\RDF} triples is encoded in an {\av} set
placing in each {\av} the subset of triples sharing the same subject, which
becomes the head string of that {\av}.

Note that the use of {\av} sets to build query results allows {\MathQL} queries
to return sets of {\RDF} triples instead of mere sets of resources, in the
spirit of what is currently done by other {\RDF}-oriented query languages.
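
The encoding of triples described above can be sketched in Python as follows.
This is an illustration only: the function name \TT{triples\_to\_av\_set} is
hypothetical, each {\av} is simplified to a single attribute group, and
attribute names are plain strings rather than lists of components.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def triples_to_av_set(triples: Iterable[Triple]) -> Dict[str, Dict[str, Set[str]]]:
    """Map each subject (head string) to one group: predicate -> set of objects."""
    av_set = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        # Objects of a repeated predicate accumulate in the same attribute.
        av_set[subject][predicate].add(obj)
    return {head: dict(attrs) for head, attrs in av_set.items()}


triples = [
    ("protocol", "dc:creator", "Sandro Hawke"),
    ("protocol", "dc:creator", "Eric Prud'hommeaux"),
    ("protocol", "dc:date", "2002-01-08"),
]
av_set = triples_to_av_set(triples)
```

Applied to the triples of \figref{AVOne}, the sketch yields one {\av} with head
string ``protocol'' whose ``dc:creator'' attribute holds both creators.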

If the {\av}'s of an {\av} set share the same attribute names and grouping
structure, this set can be represented as a table in which each row encodes
an {\av} and each column is associated with an attribute (except the first one
which holds the head strings).
\figref{Table} shows an {\av} set describing the properties of two resources
``A'' and ``B'' giving its table representation, in which the columns
corresponding to attributes in the same group are clustered between
double-line delimiters.%
\footnote{A table with grouped labelled columns like the one above resembles a
set of relational database tables.}

\begin{figure}
\begin{footnotesize} \begin{verbatim}
"A" attr {/"major" = "1"; /"minor" = "2"},
         {/"first" = "2002-01-01"; /"modified" = "2002-03-01"};
"B" attr {/"major" = "1"; /"minor" = "7"},
         {/"first" = "2002-02-01"; /"modified" = "2002-04-01"}
\end{verbatim}
\begin{center} \begin{tabular}{|c||c|c||c|c||}
\hline & \textbf{``major''} & \textbf{``minor''} & \textbf{``first''} & \textbf{``modified''} \\
\hline ``A'' & ``1'' & ``2'' & ``2002-01-01'' & ``2002-03-01'' \\
\hline ``B'' & ``1'' & ``7'' & ``2002-02-01'' & ``2002-04-01'' \\
\hline
\end{tabular} \end{center} \end{footnotesize}
\caption{A set of attributed values displayed as a table} \label{Table}
\end{figure}

The above example gives a spatial idea of the geometry of an {\av} set ({\ie}
a query result) which fits in 4 dimensions: namely we can extend independently
the set of the head strings (dimension 1), the attributes in each group
(dimension 2), the groups in each {\av} (dimension 3) and the contents of each
attribute (dimension 4).
The metadata defined in the table of \figref{Table} will be used in subsequent
examples.
For this purpose assume that ``first'' and ``modified'' are the components
of a structured property ``date'' available for the resources ``A'' and ``B''.

The value of an {\RDF} property is encoded in an {\av} distinguishing three
cases.

If the property is unstructured, its value is placed in the {\av} head
string and no attributes are defined.

If the property is structured and its value has a main component%
\footnote{Which is set by the \emph{rdf:value} property or defined by a
specific application.},
the content of this component is placed in the {\av} head string and the
other components are stored in the {\av} attributes as in case 1.

For the value of a structured property without a main component, the head
string is empty and the components are stored in the attributes.
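
The three cases above can be summarized by a short Python sketch.
The function names are hypothetical and, as a simplifying assumption, an {\av}
is modelled as a pair made of a head string and a single attribute group with
one string per attribute.

```python
from typing import Dict, Optional, Tuple

AV = Tuple[str, Dict[str, str]]  # (head string, one attribute group)


def encode_unstructured(value: str) -> AV:
    """Unstructured property: the value is the head string, no attributes."""
    return (value, {})


def encode_structured(components: Dict[str, str], main: Optional[str] = None) -> AV:
    """Structured property, with or without a main component."""
    if main is None:
        # No main component: empty head string, every component is an attribute.
        return ("", dict(components))
    # Main component: its content is the head string, the rest are attributes.
    rest = {name: value for name, value in components.items() if name != main}
    return (components[main], rest)


ident = {"major": "1", "minor": "2"}
no_main = encode_structured(ident)
main_major = encode_structured(ident, main="major")
main_minor = encode_structured(ident, main="minor")
```

With the components ``major'' set to ``1'' and ``minor'' set to ``2'', the
three calls produce an {\av} with an empty head string, one with head string
``1'' and one with head string ``2'' respectively.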

\begin{figure}
\begin{footnotesize} \begin{verbatim}
First example, one instance:
   "" attr {/"major" = "1"; /"minor" = "2"}      no main component
   "1" attr {/"minor" = "2"}                     main component is "major"
   "2" attr {/"major" = "1"}                     main component is "minor"

Second example, two separate instances:
   "" attr {/"major" = "1"; /"minor" = "2"},
           {/"major" = "1"; /"minor" = "7"}      no main component
   "1" attr {/"minor" = "2"}, {/"minor" = "7"}   main component is "major"

Third example, two mixed instances:
   "" attr {/"major" = {"3", "6"}; /"minor" = {"4", "9"}}   no main component
\end{verbatim} \end{footnotesize}
\caption{The representation of the structured value of a property}
\label{AVTwo}
\end{figure}

\figref{AVTwo} (first example) shows three possible ways of representing in
{\av}'s an instance of a structured property ``id'' whose value has two
fields ({\ie} properties) ``major'' and ``minor''.
In this instance, ``major'' is set to ``1'' and ``minor'' is set to ``2''.
The representations depend on which component of ``id'' is chosen as the
main component (none, ``major'' or ``minor'' respectively).
Several structured property values sharing a common main component can be
encoded in a single {\av} exploiting the grouping facility: in this case the
attributes of every instance are enclosed in separate groups.
\figref{AVTwo} (second example) shows the representations of two instances of
``id'': the former and a new one for which ``major'' is ``1'' and ``minor'' is
``7''.

Note that if the attributes of the two groups are encoded in a single group,
the notion of which components belong to the same property value cannot be
recovered in the general case because the values of an attribute form a set
and thus are unordered.
As an example think of two instances of ``id'' encoded as in \figref{AVTwo}
(third example).

A natural number is stored, using its decimal representation, in the head
string of a single {\av} with no attributes.

The boolean value \emph{false} is stored as an empty {\av} set, whereas
an inhabited {\av} set may be interpreted as the boolean value \emph{true}.
The default representation of \emph{true} is a single {\av} with an empty
head string and no attributes.
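
The conventions for natural numbers and booleans can be sketched as follows.
The helper names are hypothetical and an {\av} set is modelled, as an
assumption of this sketch, by a dictionary from head strings to sets of
attribute groups.

```python
from typing import Dict, FrozenSet

AVSet = Dict[str, FrozenSet]  # head string -> set of attribute groups


def encode_nat(n: int) -> AVSet:
    """A natural number: one av, decimal head string, no attributes."""
    if n < 0:
        raise ValueError("not a natural number")
    return {str(n): frozenset()}


def decode_nat(av_set: AVSet) -> int:
    """Inverse of encode_nat; fails unless there is exactly one bare av."""
    ((head, groups),) = av_set.items()
    if groups:
        raise ValueError("unexpected attributes")
    return int(head)


def encode_bool(b: bool) -> AVSet:
    """False is the empty av set; true defaults to one av with empty head."""
    return {"": frozenset()} if b else {}


def decode_bool(av_set: AVSet) -> bool:
    """Any inhabited av set is read as true."""
    return len(av_set) > 0
```

Note that, under this reading, the encoding of any natural number is an
inhabited {\av} set and is therefore interpreted as \emph{true}.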

{\MathQL} defines five core binary operations on {\av} sets: two unions, two
intersections and a difference. The first four are defined in terms of an
operation, that we call \emph{addition}, involving two {\av}'s with the same
head string.
The result is an {\av} with the same head string as the operands but there are
two ways to compose the attribute groups:
\begin{itemize}
\item
with the \emph{set-theoretic} addition, the set of attribute groups in the
resulting {\av} is the set-theoretic union of the sets of attribute groups in
the operands;
\item
with the \emph{distributive} addition, the set of attribute groups in the
resulting {\av} is the ``Cartesian product'' of the sets of attribute groups
in the operands.
Here an element of the ``Cartesian product'' is not a pair of groups but it is
the set-theoretic union of these groups where the contents of homonymous
attributes are clustered together using set-theoretic unions.
\end{itemize}
\figref{Addition} shows an example of the two kinds of addition.

\begin{figure}
\begin{footnotesize} \begin{verbatim}
Attributed values used as operands for the addition:
   "1" attr {/"A" = "a"}, {/"B" = "b1"}
   "1" attr {/"A" = "a"}, {/"B" = "b2"}

Set-theoretic addition:
   "1" attr {/"A" = "a"}, {/"B" = "b1"}, {/"B" = "b2"}

Distributive addition:
   "1" attr {/"A" = "a"}, {/"A" = "a"; /"B" = "b2"},
            {/"B" = "b1"; /"A" = "a"}, {/"B" = {"b1", "b2"}}
\end{verbatim} \end{footnotesize}
\caption{The addition of attributed values}
\label{Addition}
\end{figure}
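
The two kinds of addition can be sketched in Python on the attribute groups of
two {\av}'s that share the same head string.
This is an illustration, not the {\MathQL} implementation: the names are
hypothetical, attribute names are simplified to plain strings, and the
distributive addition is read literally as a Cartesian product, so an operand
without groups yields a result without groups.

```python
from itertools import product
from typing import Dict, FrozenSet, Iterable, Tuple

Group = FrozenSet[Tuple[str, FrozenSet[str]]]   # (name, contents) pairs
Groups = FrozenSet[Group]


def group(attrs: Dict[str, Iterable[str]]) -> Group:
    """Freeze one attribute group given as name -> strings."""
    return frozenset((name, frozenset(values)) for name, values in attrs.items())


def merge(g1: Group, g2: Group) -> Group:
    """Union of two groups; contents of homonymous attributes are united."""
    merged: Dict[str, FrozenSet[str]] = dict(g1)
    for name, values in g2:
        merged[name] = merged.get(name, frozenset()) | values
    return frozenset(merged.items())


def add_set_theoretic(x: Groups, y: Groups) -> Groups:
    """Set-theoretic union of the two sets of groups."""
    return x | y


def add_distributive(x: Groups, y: Groups) -> Groups:
    """Merge every group of x with every group of y."""
    return frozenset(merge(a, b) for a, b in product(x, y))


left = frozenset({group({"A": ["a"]}), group({"B": ["b1"]})})
right = frozenset({group({"A": ["a"]}), group({"B": ["b2"]})})
set_sum = add_set_theoretic(left, right)
dist_sum = add_distributive(left, right)
```

With the operands of \figref{Addition}, \TT{set\_sum} contains three groups and
\TT{dist\_sum} contains four, matching the two results shown in the figure.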

Now we can discuss the five operations between {\av} sets:

The two unions correspond to the set-theoretic union of their operands where
the {\av}'s sharing the head string are added either set-theoretically or
distributively as explained above (thus we have a set-theoretic union and a
distributive union in the two cases). In this context the empty {\av} set
plays the role of the neutral element.
These operations play a central role in the {\MathQL} architecture and allow
composing the attributes of the operands while preserving their group
structure.

The two intersections are the dual of the above unions: they contain the
{\av}'s whose head string appears in each argument where the {\av}'s sharing
the head string are added either set-theoretically or distributively as before.

The distributive intersection has the double benefit of filtering the
common values of the given {\av} sets, and of merging their attribute groups
in every possible way. This feature makes it possible to perform
additional filtering operations that check the content of the merged groups.

The difference of two {\av} sets contains the {\av}'s of the first
argument whose head string does not appear in the second argument.

\figref{Binary} shows how the above operations work in a simple example.

\begin{figure}
\begin{footnotesize} \begin{verbatim}
Sets of attributed values used as operands for the operations:
   "1" attr {/"A" = "a"}; "2" attr {/"B" = "b1"}
   "2" attr {/"B" = "b2"}

Set-theoretic union:
   "1" attr {/"A" = "a"}; "2" attr {/"B" = "b1"}, {/"B" = "b2"}

Distributive union:
   "1" attr {/"A" = "a"}; "2" attr {/"B" = {"b1", "b2"}}

Set-theoretic intersection:
   "2" attr {/"B" = "b1"}, {/"B" = "b2"}

Distributive intersection:
   "2" attr {/"B" = {"b1", "b2"}}

Difference:
   "1" attr {/"A" = "a"}
\end{verbatim} \end{footnotesize}
\caption{The binary operations on sets of attributed values}
\label{Binary}
\end{figure}
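
The five operations can be sketched in Python on top of the two additions.
As before this is only an illustration under assumptions: all names are
hypothetical, an {\av} set is modelled as a dictionary from head strings to
sets of groups, and attribute names are plain strings.

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

Group = FrozenSet[Tuple[str, FrozenSet[str]]]
Groups = FrozenSet[Group]
AVSet = Dict[str, Groups]                      # head string -> groups
Addition = Callable[[Groups, Groups], Groups]


def group(attrs: Dict[str, Iterable[str]]) -> Group:
    return frozenset((name, frozenset(values)) for name, values in attrs.items())


def merge(g1: Group, g2: Group) -> Group:
    merged: Dict[str, FrozenSet[str]] = dict(g1)
    for name, values in g2:
        merged[name] = merged.get(name, frozenset()) | values
    return frozenset(merged.items())


def add_set(x: Groups, y: Groups) -> Groups:
    return x | y


def add_dist(x: Groups, y: Groups) -> Groups:
    return frozenset(merge(a, b) for a, b in product(x, y))


def union(s1: AVSet, s2: AVSet, add: Addition) -> AVSet:
    """Keep every head string; add the av's whose head string is shared."""
    result = dict(s1)
    for head, groups in s2.items():
        result[head] = add(result[head], groups) if head in result else groups
    return result


def intersection(s1: AVSet, s2: AVSet, add: Addition) -> AVSet:
    """Keep only the shared head strings, adding the corresponding av's."""
    return {head: add(s1[head], s2[head]) for head in s1 if head in s2}


def difference(s1: AVSet, s2: AVSet) -> AVSet:
    """Keep the av's of s1 whose head string does not appear in s2."""
    return {head: groups for head, groups in s1.items() if head not in s2}


s1 = {"1": frozenset({group({"A": ["a"]})}), "2": frozenset({group({"B": ["b1"]})})}
s2 = {"2": frozenset({group({"B": ["b2"]})})}
```

Passing \TT{add\_set} or \TT{add\_dist} to \TT{union} and \TT{intersection}
selects the set-theoretic or the distributive variant, and on the operands
\TT{s1} and \TT{s2} the sketch reproduces the results of \figref{Binary}.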