X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=blobdiff_plain;f=helm%2FDEVEL%2Fpxp%2Fpxp%2Fdoc%2Fmanual%2Fhtml%2Fx675.html;fp=helm%2FDEVEL%2Fpxp%2Fpxp%2Fdoc%2Fmanual%2Fhtml%2Fx675.html;h=0000000000000000000000000000000000000000;hb=c7514aaa249a96c5fdd39b1123fbdb38d92f20b6;hp=cf3f4737ce506b5f6385bf097cd7614209e72a3c;hpb=1c7fb836e2af4f2f3d18afd0396701f2094265ff;p=helm.git diff --git a/helm/DEVEL/pxp/pxp/doc/manual/html/x675.html b/helm/DEVEL/pxp/pxp/doc/manual/html/x675.html deleted file mode 100644 index cf3f4737c..000000000 --- a/helm/DEVEL/pxp/pxp/doc/manual/html/x675.html +++ /dev/null @@ -1,538 +0,0 @@ -
By default, the parsed node tree consists of objects of the same class; this is a good design as long as you only want to access selected parts of the document. For complex transformations, it may be better to use different classes for objects describing different element types.
For example, if the DTD declares the element types a, b, and c, and the task is to convert an arbitrary document into a printable format, the idea is to define a separate class with a print method for every element type. The classes are eltype_a, eltype_b, and eltype_c, and each class implements print such that elements of the corresponding type are converted to the output format.
The parser supports such a design directly. As it is impossible to derive recursive classes in O'Caml[1], the specialized element classes cannot be formed by simply inheriting from the built-in classes of the parser and adding methods for customized functionality. To get around this limitation, every node of the document tree is represented by two objects: one called "the node", which contains the recursive definition of the tree, and one called "the extension". Every node object has a reference to its extension, and the extension has a reference to its node. The advantage of this model is that the extension can be customized without affecting the typing constraints of the recursive node definition.
Every extension must have the three methods clone, node, and set_node. The method clone creates a deep copy of the extension object and returns it; node returns the node object for this extension object; and set_node tells the extension object which node is associated with it. The latter method is called automatically when the node tree is initialized. The following definition is a good starting point for these methods; usually clone must be further refined when instance variables are added to the class:

    class custom_extension =
      object (self)

        val mutable node = (None : custom_extension node option)

        method clone = {< >}
        method node =
          match node with
              None ->
                assert false
            | Some n -> n
        method set_node n =
          node <- Some n

      end

This part of the extension is usually the same for all classes, so it is a good idea to make custom_extension the superclass of the further class definitions. Continuing the example above, we can define the element type classes as follows:

    class virtual custom_extension =
      object (self)
        ... clone, node, set_node defined as above ...

        method virtual print : out_channel -> unit
      end

    class eltype_a =
      object (self)
        inherit custom_extension
        method print ch = ...
      end

    class eltype_b =
      object (self)
        inherit custom_extension
        method print ch = ...
      end

    class eltype_c =
      object (self)
        inherit custom_extension
        method print ch = ...
      end

The method print can now be implemented for every element type separately. Note that you get the associated node by invoking

    self # node

and you get the extension object of a node n by writing

    n # extension

It is guaranteed that

    self # node # extension == self

always holds.
Here are sample definitions of the print methods:
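The mutual link between node and extension can be tried out without the parser. The following is only a minimal sketch under stated assumptions: toy_node, toy_extension, and link are hypothetical stand-ins invented for this illustration and are not the classes of Pxp_document. The check is done from outside the objects, because the toy types are closed there.

```ocaml
(* Toy stand-in for a node class; NOT PXP's real node class.
   The node is parameterised over the type of its extension. *)
class ['ext] toy_node =
  object
    val mutable ext : 'ext option = None
    method extension : 'ext =
      match ext with
      | Some e -> e
      | None -> failwith "node has no extension yet"
    method set_extension (e : 'ext) = ext <- Some e
  end

(* Toy extension with the three required methods from the text. *)
class toy_extension =
  object
    val mutable node = (None : toy_extension toy_node option)
    method clone = {< >}
    method node =
      match node with
      | None -> assert false
      | Some n -> n
    method set_node n = node <- Some n
  end

(* Hypothetical helper: wire a fresh node and an extension together. *)
let link (e : toy_extension) : toy_extension toy_node =
  let n = new toy_node in
  n # set_extension e;
  e # set_node n;
  n

let () =
  let e = new toy_extension in
  let n = link e in
  (* Physical equality: both directions lead back to the same objects. *)
  assert (e # node # extension == e);
  assert (n # extension # node == n)
```

In this sketch the invariant holds simply because link sets both references in one place; the text states that the parser establishes the same association when the node tree is initialized.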

    class eltype_a =
      object (self)
        inherit custom_extension
        method print ch =
          (* Nodes <a>...</a> are only containers: *)
          output_string ch "(";
          List.iter
            (fun n -> n # extension # print ch)
            (self # node # sub_nodes);
          output_string ch ")"
      end

    class eltype_b =
      object (self)
        inherit custom_extension
        method print ch =
          (* Print the value of the CDATA attribute "print": *)
          match self # node # attribute "print" with
              Value s       -> output_string ch s
            | Implied_value -> output_string ch "<missing>"
            | Valuelist _   -> assert false
                (* not possible because the att is CDATA *)
      end

    class eltype_c =
      object (self)
        inherit custom_extension
        method print ch =
          (* Print the contents of this element: *)
          output_string ch (self # node # data)
      end

    class null_extension =
      object (self)
        inherit custom_extension
        method print ch = assert false
      end

The remaining task is to configure the parser such that these extension classes are actually used. Here another problem arises: it is not possible to dynamically select the class of an object to be created. As a workaround, PXP allows the user to specify exemplar objects for the various element types; instead of creating the nodes of the tree by applying the new operator, the nodes are produced by duplicating the exemplars. As object duplication preserves the class of the object, one can create fresh objects of every class for which an exemplar has previously been registered.
Exemplars are meant as objects without contents; the only interesting thing about them is that they are instances of a certain class. An exemplar for an element node can be created by:
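That duplication preserves the class can be seen in a few lines of plain O'Caml. This is a sketch only; printer, printer_a, printer_b, and instantiate are hypothetical names made up for the illustration and have no counterpart in PXP.

```ocaml
(* Hypothetical base class with a self-copying method. *)
class virtual printer =
  object
    method virtual tag : string
    (* {< >} copies the object; the copy is of the same class. *)
    method clone = {< >}
  end

class printer_a = object inherit printer method tag = "a" end
class printer_b = object inherit printer method tag = "b" end

(* The static type is always [printer]; which class the fresh object
   belongs to depends only on the exemplar passed at run time. *)
let instantiate (exemplar : printer) : printer = exemplar # clone

let () =
  let exemplars : printer list = [ new printer_a; new printer_b ] in
  List.iter
    (fun ex ->
       let fresh = instantiate ex in
       assert (fresh != ex);              (* a new object ... *)
       assert (fresh # tag = ex # tag))   (* ... of the exemplar's class *)
    exemplars
```

The call to instantiate never names a class, which is the point of the workaround: the choice of class is made by choosing which exemplar to copy.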

    let element_exemplar = new element_impl extension_exemplar

And a data node exemplar is created by:

    let data_exemplar = new data_impl extension_exemplar

The classes element_impl and data_impl are defined in the module Pxp_document. The constructors initialize the fresh objects as empty objects, i.e. without children, without data contents, and so on. The extension_exemplar is the initial extension object the exemplars are associated with.
Once the exemplars are created and stored somewhere (e.g. in a hash table), you can take an exemplar and create a concrete instance (with contents) by duplicating it. As a user of the parser you are normally not concerned with this, as it is part of the internal logic of the parser. As background knowledge, however, it is worth mentioning that the two methods create_element and create_data duplicate the exemplar for which they are invoked, apply further modifications to the clone, and finally return the new object. Moreover, the extension object is copied, too, and the new node object is associated with the fresh extension object. This is the reason why every extension object must have a clone method.
The configuration of the set of exemplars is passed to the parse_document_entity function as its third argument. In our example, this argument can be set up as follows:
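The duplication step described above can be modelled in a self-contained way. What follows is a rough sketch, not PXP's implementation: toy_node, toy_extension, ext_a, ext_b, make_exemplar, and the free-standing function create_element are all hypothetical, and the real create_element is a method of the node classes whose arguments are not reproduced here.

```ocaml
(* Toy node class (hypothetical). [duplicate] copies the node only. *)
class ['ext] toy_node =
  object
    val mutable ext : 'ext option = None
    val name = "<exemplar>"
    method name = name
    method extension : 'ext =
      match ext with
      | Some e -> e
      | None -> failwith "node has no extension"
    method set_extension (e : 'ext) = ext <- Some e
    (* Copy of the node with a new name; still shares the old extension. *)
    method duplicate (new_name : string) = {< name = new_name >}
  end

(* Toy extension with the three required methods plus a virtual one. *)
class virtual toy_extension =
  object
    val mutable node = (None : toy_extension toy_node option)
    method clone = {< >}
    method node =
      match node with
      | None -> assert false
      | Some n -> n
    method set_node n = node <- Some n
    method virtual label : string
  end

class ext_a = object inherit toy_extension method label = "a" end
class ext_b = object inherit toy_extension method label = "b" end

(* An exemplar: an empty node linked to its initial extension. *)
let make_exemplar (e : toy_extension) : toy_extension toy_node =
  let n = new toy_node in
  n # set_extension e;
  e # set_node n;
  n

(* Sketch of the duplication: copy the node, clone the extension,
   then associate the two fresh objects with each other. *)
let create_element (exemplar : toy_extension toy_node) (name : string) =
  let n = exemplar # duplicate name in
  let e = exemplar # extension # clone in
  n # set_extension e;
  e # set_node n;
  n

let () =
  let ex_a = make_exemplar (new ext_a) in
  let ex_b = make_exemplar (new ext_b) in
  let n1 = create_element ex_a "first" in
  let n2 = create_element ex_b "second" in
  Printf.printf "%s uses extension class %s\n" (n1 # name) (n1 # extension # label);
  Printf.printf "%s uses extension class %s\n" (n2 # name) (n2 # extension # label)
```

Without the call to clone, the new node and its exemplar would share one extension object, and the second call to set_node would redirect the exemplar's extension to the new node. This is why the sketch, like the text, requires clone on every extension.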

    let spec =
      make_spec_from_alist
        ~data_exemplar:            (new data_impl (new null_extension))
        ~default_element_exemplar: (new element_impl (new null_extension))
        ~element_alist:
           [ "a", new element_impl (new eltype_a);
             "b", new element_impl (new eltype_b);
             "c", new element_impl (new eltype_c);
           ]
        ()

The ~element_alist function argument defines the mapping from element types to exemplars as an associative list. The argument ~data_exemplar specifies the exemplar for data nodes, and the ~default_element_exemplar is used whenever the parser finds an element type for which the associative list does not define an exemplar.
The configuration is now complete. You can still use the same parsing functions; only the initialization is a bit different. For example, call the parser by:
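The fallback rule for element types can be written down as a small lookup. This is only a model of the behaviour described above: the record toy_spec and the function element_exemplar_for are hypothetical, the value actually returned by make_spec_from_alist is not shown in this text, and plain strings stand in for exemplar objects.

```ocaml
(* Hypothetical record mirroring the three labelled arguments. *)
type 'node toy_spec = {
  data_exemplar : 'node;
  default_element_exemplar : 'node;
  element_alist : (string * 'node) list;
}

(* Pick the exemplar registered for an element type, or the default
   exemplar if the associative list has no entry for it. *)
let element_exemplar_for (spec : 'node toy_spec) (element_type : string) : 'node =
  match List.assoc_opt element_type spec.element_alist with
  | Some exemplar -> exemplar
  | None -> spec.default_element_exemplar

(* Strings are used as stand-ins for exemplar objects. *)
let toy_spec_example = {
  data_exemplar = "data exemplar";
  default_element_exemplar = "default exemplar";
  element_alist =
    [ "a", "exemplar for a";
      "b", "exemplar for b";
      "c", "exemplar for c";
    ];
}

let () =
  List.iter
    (fun ty ->
       Printf.printf "%s -> %s\n" ty (element_exemplar_for toy_spec_example ty))
    [ "a"; "c"; "unknown" ]
```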

    let d = parse_document_entity default_config (from_file "doc.xml") spec

Note that the resulting document d has a usable type; in particular, the print method we added is visible. So you can print your document by

d # root # extension # print stdout
This object-oriented approach looks rather complicated; this is mostly caused by working around some problems of the strict type system of O'Caml. Some auxiliary concepts such as extensions were needed, but the practical consequences are minor. The next section explains one of the examples of the distribution, a converter from readme documents to HTML.
[1] The problem is that the subclass is usually not a subtype in this case, because O'Caml's subtyping rule is contravariant in the argument types of methods.