X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=blobdiff_plain;f=helm%2FDEVEL%2Fpxp%2Fpxp%2Fdoc%2Fmanual%2Fhtml%2Fx738.html;fp=helm%2FDEVEL%2Fpxp%2Fpxp%2Fdoc%2Fmanual%2Fhtml%2Fx738.html;h=674180172b10729a86dcee2ca6ab2cdc34156bff;hb=c03d2c1fdab8d228cb88aaba5ca0f556318bebc5;hp=0000000000000000000000000000000000000000;hpb=758057e85325f94cd88583feb1fdf6b038e35055;p=helm.git diff --git a/helm/DEVEL/pxp/pxp/doc/manual/html/x738.html b/helm/DEVEL/pxp/pxp/doc/manual/html/x738.html new file mode 100644 index 000000000..674180172 --- /dev/null +++ b/helm/DEVEL/pxp/pxp/doc/manual/html/x738.html @@ -0,0 +1,1036 @@ +Example: An HTML backend for the readme +DTD
The PXP user's guide
PrevChapter 2. Using PXPNext

2.4. Example: An HTML backend for the readme +DTD

The converter from readme documents to HTML +documents follows strictly the approach to define one class per element +type. The HTML code is similar to the readme source, +because of this most elements can be converted in the following way: Given the +input element + +

<e>content</e>
+ +the conversion text is the concatenation of a computed prefix, the recursively +converted content, and a computed suffix.

Only one element type cannot be handled by this scheme: +footnote. Footnotes are collected while they are found in +the input text, and they are printed after the main text has been converted and +printed.

2.4.1. Header

open Pxp_types
+open Pxp_document

2.4.2. Type declarations

class type footnote_printer =
+  object
+    method footnote_to_html : store_type -> out_channel -> unit
+  end
+
+and store_type =
+  object
+    method alloc_footnote : footnote_printer -> int
+    method print_footnotes : out_channel -> unit
+  end
+;;

2.4.3. Class store

The store is a container for footnotes. You can add a +footnote by invoking alloc_footnote; the argument is an +object of the class footnote_printer, the method returns the +number of the footnote. The interesting property of a footnote is that it can +be converted to HTML, so a footnote_printer is an object +with a method footnote_to_html. The class +footnote which is defined below has a compatible method +footnote_to_html such that objects created from it can be +used as footnote_printers.

The other method, print_footnotes prints the footnotes as +definition list, and is typically invoked after the main material of the page +has already been printed. Every item of the list is printed by +footnote_to_html.

class store =
+  object (self)
+
+    val mutable footnotes = ( [] : (int * footnote_printer) list )
+    val mutable next_footnote_number = 1
+
+    method alloc_footnote n =
+      let number = next_footnote_number in
+      next_footnote_number <- number+1;
+      footnotes <- footnotes @ [ number, n ];
+      number
+
+    method print_footnotes ch =
+      if footnotes <> [] then begin
+	output_string ch "<hr align=left noshade=noshade width=\"30%\">\n";
+	output_string ch "<dl>\n";
+	List.iter
+	  (fun (_,n) -> 
+	     n # footnote_to_html (self : #store_type :> store_type) ch)
+	  footnotes;
+	output_string ch "</dl>\n";
+      end
+
+  end
+;;

2.4.4. Function escape_html

This function converts the characters <, >, &, and " to their HTML +representation. For example, +escape_html "<>" = "&lt;&gt;". Other +characters are left unchanged. + +

let escape_html s =
+  Str.global_substitute
+    (Str.regexp "<\\|>\\|&\\|\"")
+    (fun s ->
+      match Str.matched_string s with
+        "<" -> "&lt;"
+      | ">" -> "&gt;"
+      | "&" -> "&amp;"
+      | "\"" -> "&quot;"
+      | _ -> assert false)
+    s
+;;

2.4.5. Virtual class shared

This virtual class is the abstract superclass of the extension classes shown +below. It defines the standard methods clone, +node, and set_node, and declares the type +of the virtual method to_html. This method recursively +traverses the whole element tree, and prints the converted HTML code to the +output channel passed as second argument. The first argument is the reference +to the global store object which collects the footnotes. + +

class virtual shared =
+  object (self)
+
+    (* --- default_ext --- *)
+
+    val mutable node = (None : shared node option)
+
+    method clone = {< >} 
+    method node =
+      match node with
+          None ->
+            assert false
+        | Some n -> n
+    method set_node n =
+      node <- Some n
+
+    (* --- virtual --- *)
+
+    method virtual to_html : store -> out_channel -> unit
+
+  end
+;;

2.4.6. Class only_data

This class defines to_html such that the character data of +the current node is converted to HTML. Note that self is an +extension object, self # node is the node object, and +self # node # data returns the character data of the node. + +

class only_data =
+  object (self)
+    inherit shared
+
+    method to_html store ch =
+      output_string ch (escape_html (self # node # data))
+  end
+;;

2.4.7. Class readme

This class converts elements of type readme to HTML. Such an +element is (by definition) always the root element of the document. First, the +HTML header is printed; the title attribute of the element +determines the title of the HTML page. Some aspects of the HTML page can be +configured by setting certain parameter entities, for example the background +color, the text color, and link colors. After the header, the +body tag, and the headline have been printed, the contents +of the page are converted by invoking to_html on all +children of the current node (which is the root node). Then, the footnotes are +appended to this by telling the global store object to print +the footnotes. Finally, the end tags of the HTML pages are printed.

This class is an example how to access the value of an attribute: The value is +determined by invoking self # node # attribute "title". As +this attribute has been declared as CDATA and as being required, the value has +always the form Value s where s is the +string value of the attribute.

You can also see how entity contents can be accessed. A parameter entity object +can be looked up by self # node # dtd # par_entity "name", +and by invoking replacement_text the value of the entity +is returned after inner parameter and character entities have been +processed. Note that you must use gen_entity instead of +par_entity to access general entities.

class readme =
+  object (self)
+    inherit shared
+
+    method to_html store ch =
+      (* output header *)
+      output_string 
+	ch "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2 Final//EN\">";
+      output_string
+	ch "<!-- WARNING! This is a generated file, do not edit! -->\n";
+      let title = 
+	match self # node # attribute "title" with
+	    Value s -> s
+	  | _ -> assert false
+      in
+      let html_header, _ =
+	try (self # node # dtd # par_entity "readme:html:header") 
+            # replacement_text
+	with WF_error _ -> "", false in
+      let html_trailer, _ =
+	try (self # node # dtd # par_entity "readme:html:trailer")
+            # replacement_text
+	with WF_error _ -> "", false in
+      let html_bgcolor, _ =
+	try (self # node # dtd # par_entity "readme:html:bgcolor")
+            # replacement_text
+	with WF_error _ -> "white", false in
+      let html_textcolor, _ =
+	try (self # node # dtd # par_entity "readme:html:textcolor")
+            # replacement_text
+	with WF_error _ -> "", false in
+      let html_alinkcolor, _ =
+	try (self # node # dtd # par_entity "readme:html:alinkcolor")
+            # replacement_text
+	with WF_error _ -> "", false in
+      let html_vlinkcolor, _ =
+	try (self # node # dtd # par_entity "readme:html:vlinkcolor")
+            # replacement_text
+	with WF_error _ -> "", false in
+      let html_linkcolor, _ =
+	try (self # node # dtd # par_entity "readme:html:linkcolor")
+            # replacement_text
+	with WF_error _ -> "", false in
+      let html_background, _ =
+	try (self # node # dtd # par_entity "readme:html:background")
+            # replacement_text
+	with WF_error _ -> "", false in
+
+      output_string ch "<html><header><title>\n";
+      output_string ch (escape_html title);
+      output_string ch "</title></header>\n";
+      output_string ch "<body ";
+      List.iter
+	(fun (name,value) ->
+	   if value <> "" then 
+	     output_string ch (name ^ "=\"" ^ escape_html value ^ "\" "))
+	[ "bgcolor",    html_bgcolor;
+	  "text",       html_textcolor;
+	  "link",       html_linkcolor;
+	  "alink",      html_alinkcolor;
+	  "vlink",      html_vlinkcolor;
+	];
+      output_string ch ">\n";
+      output_string ch html_header;
+      output_string ch "<h1>";
+      output_string ch (escape_html title);
+      output_string ch "</h1>\n";
+      (* process main content: *)
+      List.iter
+	(fun n -> n # extension # to_html store ch)
+	(self # node # sub_nodes);
+      (* now process footnotes *)
+      store # print_footnotes ch;
+      (* trailer *)
+      output_string ch html_trailer;
+      output_string ch "</html>\n";
+
+  end
+;;

2.4.8. Classes section, sect1, +sect2, and sect3

As the conversion process is very similar, the conversion classes of the three +section levels are derived from the more general section +class. The HTML code of the section levels only differs in the type of the +headline, and because of this the classes describing the section levels can be +computed by replacing the class argument the_tag of +section by the HTML name of the headline tag.

Section elements are converted to HTML by printing a headline and then +converting the contents of the element recursively. More precisely, the first +sub-element is always a title element, and the other +elements are the contents of the section. This structure is declared in the +DTD, and it is guaranteed that the document matches the DTD. Because of this +the title node can be separated from the rest without any checks.

Both the title node, and the body nodes are then converted to HTML by calling +to_html on them.

class section the_tag =
+  object (self)
+    inherit shared
+
+    val tag = the_tag
+
+    method to_html store ch =
+      let sub_nodes = self # node # sub_nodes in
+      match sub_nodes with
+	  title_node :: rest ->
+	    output_string ch ("<" ^ tag ^ ">\n");
+	    title_node # extension # to_html store ch;
+	    output_string ch ("\n</" ^ tag ^ ">");
+	    List.iter
+	      (fun n -> n # extension # to_html store ch)
+	      rest
+	| _ ->
+	    assert false
+  end
+;;
+
+class sect1 = section "h1";;
+class sect2 = section "h3";;
+class sect3 = section "h4";;

2.4.9. Classes map_tag, p, +em, ul, li

Several element types are converted to HTML by simply mapping them to +corresponding HTML element types. The class map_tag +implements this, and the class argument the_target_tag +determines the tag name to map to. The output consists of the start tag, the +recursively converted inner elements, and the end tag. + +

class map_tag the_target_tag =
+  object (self)
+    inherit shared
+
+    val target_tag = the_target_tag
+
+    method to_html store ch =
+      output_string ch ("<" ^ target_tag ^ ">\n");
+      List.iter
+	(fun n -> n # extension # to_html store ch)
+	(self # node # sub_nodes);
+      output_string ch ("\n</" ^ target_tag ^ ">");
+  end
+;;
+
+class p = map_tag "p";;
+class em = map_tag "b";;
+class ul = map_tag "ul";;
+class li = map_tag "li";;

2.4.10. Class br

Element of type br are mapped to the same HTML type. Note +that HTML forbids the end tag of br. + +

class br =
+  object (self)
+    inherit shared
+
+    method to_html store ch =
+      output_string ch "<br>\n";
+      List.iter
+	(fun n -> n # extension # to_html store ch)
+	(self # node # sub_nodes);
+  end
+;;

2.4.11. Class code

The code type is converted to a pre +section (preformatted text). As the meaning of tabs is unspecified in HTML, +tabs are expanded to spaces. + +

class code =
+  object (self)
+    inherit shared
+
+    method to_html store ch =
+      let data = self # node # data in
+      (* convert tabs *)
+      let l = String.length data in
+      let rec preprocess i column =
+	(* this is very ineffective but comprehensive: *)
+	if i < l then
+	  match data.[i] with
+	      '\t' ->
+		let n = 8 - (column mod 8) in
+		String.make n ' ' ^ preprocess (i+1) (column + n)
+	    | '\n' ->
+		"\n" ^ preprocess (i+1) 0
+	    | c ->
+		String.make 1 c ^ preprocess (i+1) (column + 1)
+	else
+	  ""
+      in
+      output_string ch "<p><pre>";
+      output_string ch (escape_html (preprocess 0 0));
+      output_string ch "</pre></p>";
+
+  end
+;;

2.4.12. Class a

Hyperlinks, expressed by the a element type, are converted +to the HTML a type. If the target of the hyperlink is given +by href, the URL of this attribute can be used +directly. Alternatively, the target can be given by +readmeref in which case the ".html" suffix must be added to +the file name.

Note that within a only #PCDATA is allowed, so the contents +can be converted directly by applying escape_html to the +character data contents. + +

class a =
+  object (self)
+    inherit shared
+
+    method to_html store ch =
+      output_string ch "<a ";
+      let href =
+	match self # node # attribute "href" with
+	    Value v -> escape_html v
+	  | Valuelist _ -> assert false
+	  | Implied_value ->
+	      begin match self # node # attribute "readmeref" with
+		  Value v -> escape_html v ^ ".html"
+		| Valuelist _ -> assert false
+		| Implied_value ->
+		    ""
+	      end
+      in
+      if href <> "" then
+	output_string ch ("href=\""  ^ href ^ "\"");
+      output_string ch ">";
+      output_string ch (escape_html (self # node # data));
+      output_string ch "</a>";
+	
+  end
+;;

2.4.13. Class footnote

The footnote class has two methods: +to_html to convert the footnote reference to HTML, and +footnote_to_html to convert the footnote text itself.

The footnote reference is converted to a local hyperlink; more precisely, to +two anchor tags which are connected with each other. The text anchor points to +the footnote anchor, and the footnote anchor points to the text anchor.

The footnote must be allocated in the store object. By +allocating the footnote, you get the number of the footnote, and the text of +the footnote is stored until the end of the HTML page is reached when the +footnotes can be printed. The to_html method stores simply +the object itself, such that the footnote_to_html method is +invoked on the same object that encountered the footnote.

The to_html only allocates the footnote, and prints the +reference anchor, but it does not print nor convert the contents of the +note. This is deferred until the footnotes actually get printed, i.e. the +recursive call of to_html on the sub nodes is done by +footnote_to_html.

Note that this technique does not work if you make another footnote within a +footnote; the second footnote gets allocated but not printed.

class footnote =
+  object (self)
+    inherit shared
+
+    val mutable footnote_number = 0
+
+    method to_html store ch =
+      let number = 
+	store # alloc_footnote (self : #shared :> footnote_printer) in
+      let foot_anchor = 
+	"footnote" ^ string_of_int number in
+      let text_anchor =
+	"textnote" ^ string_of_int number in
+      footnote_number <- number;
+      output_string ch ( "<a name=\"" ^ text_anchor ^ "\" href=\"#" ^ 
+			 foot_anchor ^ "\">[" ^ string_of_int number ^ 
+			 "]</a>" )
+
+    method footnote_to_html store ch =
+      (* prerequisite: we are in a definition list <dl>...</dl> *)
+      let foot_anchor = 
+	"footnote" ^ string_of_int footnote_number in
+      let text_anchor =
+	"textnote" ^ string_of_int footnote_number in
+      output_string ch ("<dt><a name=\"" ^ foot_anchor ^ "\" href=\"#" ^ 
+			text_anchor ^ "\">[" ^ string_of_int footnote_number ^ 
+			"]</a></dt>\n<dd>");
+      List.iter
+	(fun n -> n # extension # to_html store ch)
+	(self # node # sub_nodes);
+      output_string ch ("\n</dd>")
+ 
+  end
+;;

2.4.14. The specification of the document model

This code sets up the hash table that connects element types with the exemplars +of the extension classes that convert the elements to HTML. + +

open Pxp_yacc
+
+let tag_map =
+  make_spec_from_alist
+    ~data_exemplar:(new data_impl (new only_data))
+    ~default_element_exemplar:(new element_impl (new no_markup))
+    ~element_alist:
+      [ "readme", (new element_impl (new readme));
+	"sect1",  (new element_impl (new sect1));
+	"sect2",  (new element_impl (new sect2));
+	"sect3",  (new element_impl (new sect3));
+	"title",  (new element_impl (new no_markup));
+	"p",      (new element_impl (new p));
+	"br",     (new element_impl (new br));
+	"code",   (new element_impl (new code));
+	"em",     (new element_impl (new em));
+	"ul",     (new element_impl (new ul));
+	"li",     (new element_impl (new li));
+	"footnote", (new element_impl (new footnote : #shared :> shared));
+	"a",      (new element_impl (new a));
+      ]
+    ()
+;;


PrevHomeNext
Class-based processing of the node treeUpThe objects representing the document
\ No newline at end of file