X-Git-Url: http://matita.cs.unibo.it/gitweb/?a=blobdiff_plain;f=helm%2FDEVEL%2Fpxp%2Fpxp%2Fdoc%2Fmanual%2Fhtml%2Fx738.html;fp=helm%2FDEVEL%2Fpxp%2Fpxp%2Fdoc%2Fmanual%2Fhtml%2Fx738.html;h=674180172b10729a86dcee2ca6ab2cdc34156bff;hb=c03d2c1fdab8d228cb88aaba5ca0f556318bebc5;hp=0000000000000000000000000000000000000000;hpb=758057e85325f94cd88583feb1fdf6b038e35055;p=helm.git diff --git a/helm/DEVEL/pxp/pxp/doc/manual/html/x738.html b/helm/DEVEL/pxp/pxp/doc/manual/html/x738.html new file mode 100644 index 000000000..674180172 --- /dev/null +++ b/helm/DEVEL/pxp/pxp/doc/manual/html/x738.html @@ -0,0 +1,1036 @@ +
The converter from readme documents to HTML +documents follows strictly the approach to define one class per element +type. The HTML code is similar to the readme source, +because of this most elements can be converted in the following way: Given the +input element + +
<e>content</e>+ +the conversion text is the concatenation of a computed prefix, the recursively +converted content, and a computed suffix.
Only one element type cannot be handled by this scheme: +footnote. Footnotes are collected while they are found in +the input text, and they are printed after the main text has been converted and +printed.
open Pxp_types +open Pxp_document
class type footnote_printer = + object + method footnote_to_html : store_type -> out_channel -> unit + end + +and store_type = + object + method alloc_footnote : footnote_printer -> int + method print_footnotes : out_channel -> unit + end +;;
The store is a container for footnotes. You can add a +footnote by invoking alloc_footnote; the argument is an +object of the class footnote_printer, the method returns the +number of the footnote. The interesting property of a footnote is that it can +be converted to HTML, so a footnote_printer is an object +with a method footnote_to_html. The class +footnote which is defined below has a compatible method +footnote_to_html such that objects created from it can be +used as footnote_printers.
The other method, print_footnotes prints the footnotes as +definition list, and is typically invoked after the main material of the page +has already been printed. Every item of the list is printed by +footnote_to_html.
class store = + object (self) + + val mutable footnotes = ( [] : (int * footnote_printer) list ) + val mutable next_footnote_number = 1 + + method alloc_footnote n = + let number = next_footnote_number in + next_footnote_number <- number+1; + footnotes <- footnotes @ [ number, n ]; + number + + method print_footnotes ch = + if footnotes <> [] then begin + output_string ch "<hr align=left noshade=noshade width=\"30%\">\n"; + output_string ch "<dl>\n"; + List.iter + (fun (_,n) -> + n # footnote_to_html (self : #store_type :> store_type) ch) + footnotes; + output_string ch "</dl>\n"; + end + + end +;;
This function converts the characters <, >, &, and " to their HTML +representation. For example, +escape_html "<>" = "<>". Other +characters are left unchanged. + +
let escape_html s = + Str.global_substitute + (Str.regexp "<\\|>\\|&\\|\"") + (fun s -> + match Str.matched_string s with + "<" -> "<" + | ">" -> ">" + | "&" -> "&" + | "\"" -> """ + | _ -> assert false) + s +;;
This virtual class is the abstract superclass of the extension classes shown +below. It defines the standard methods clone, +node, and set_node, and declares the type +of the virtual method to_html. This method recursively +traverses the whole element tree, and prints the converted HTML code to the +output channel passed as second argument. The first argument is the reference +to the global store object which collects the footnotes. + +
class virtual shared = + object (self) + + (* --- default_ext --- *) + + val mutable node = (None : shared node option) + + method clone = {< >} + method node = + match node with + None -> + assert false + | Some n -> n + method set_node n = + node <- Some n + + (* --- virtual --- *) + + method virtual to_html : store -> out_channel -> unit + + end +;;
This class defines to_html such that the character data of +the current node is converted to HTML. Note that self is an +extension object, self # node is the node object, and +self # node # data returns the character data of the node. + +
class only_data = + object (self) + inherit shared + + method to_html store ch = + output_string ch (escape_html (self # node # data)) + end +;;
This class converts elements of type readme to HTML. Such an +element is (by definition) always the root element of the document. First, the +HTML header is printed; the title attribute of the element +determines the title of the HTML page. Some aspects of the HTML page can be +configured by setting certain parameter entities, for example the background +color, the text color, and link colors. After the header, the +body tag, and the headline have been printed, the contents +of the page are converted by invoking to_html on all +children of the current node (which is the root node). Then, the footnotes are +appended to this by telling the global store object to print +the footnotes. Finally, the end tags of the HTML pages are printed.
This class is an example how to access the value of an attribute: The value is +determined by invoking self # node # attribute "title". As +this attribute has been declared as CDATA and as being required, the value has +always the form Value s where s is the +string value of the attribute.
You can also see how entity contents can be accessed. A parameter entity object +can be looked up by self # node # dtd # par_entity "name", +and by invoking replacement_text the value of the entity +is returned after inner parameter and character entities have been +processed. Note that you must use gen_entity instead of +par_entity to access general entities.
class readme = + object (self) + inherit shared + + method to_html store ch = + (* output header *) + output_string + ch "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2 Final//EN\">"; + output_string + ch "<!-- WARNING! This is a generated file, do not edit! -->\n"; + let title = + match self # node # attribute "title" with + Value s -> s + | _ -> assert false + in + let html_header, _ = + try (self # node # dtd # par_entity "readme:html:header") + # replacement_text + with WF_error _ -> "", false in + let html_trailer, _ = + try (self # node # dtd # par_entity "readme:html:trailer") + # replacement_text + with WF_error _ -> "", false in + let html_bgcolor, _ = + try (self # node # dtd # par_entity "readme:html:bgcolor") + # replacement_text + with WF_error _ -> "white", false in + let html_textcolor, _ = + try (self # node # dtd # par_entity "readme:html:textcolor") + # replacement_text + with WF_error _ -> "", false in + let html_alinkcolor, _ = + try (self # node # dtd # par_entity "readme:html:alinkcolor") + # replacement_text + with WF_error _ -> "", false in + let html_vlinkcolor, _ = + try (self # node # dtd # par_entity "readme:html:vlinkcolor") + # replacement_text + with WF_error _ -> "", false in + let html_linkcolor, _ = + try (self # node # dtd # par_entity "readme:html:linkcolor") + # replacement_text + with WF_error _ -> "", false in + let html_background, _ = + try (self # node # dtd # par_entity "readme:html:background") + # replacement_text + with WF_error _ -> "", false in + + output_string ch "<html><header><title>\n"; + output_string ch (escape_html title); + output_string ch "</title></header>\n"; + output_string ch "<body "; + List.iter + (fun (name,value) -> + if value <> "" then + output_string ch (name ^ "=\"" ^ escape_html value ^ "\" ")) + [ "bgcolor", html_bgcolor; + "text", html_textcolor; + "link", html_linkcolor; + "alink", html_alinkcolor; + "vlink", html_vlinkcolor; + ]; + output_string ch ">\n"; + output_string ch html_header; + output_string ch "<h1>"; + output_string ch (escape_html title); + output_string ch "</h1>\n"; + (* process main content: *) + List.iter + (fun n -> n # extension # to_html store ch) + (self # node # sub_nodes); + (* now process footnotes *) + store # print_footnotes ch; + (* trailer *) + output_string ch html_trailer; + output_string ch "</html>\n"; + + end +;;
As the conversion process is very similar, the conversion classes of the three +section levels are derived from the more general section +class. The HTML code of the section levels only differs in the type of the +headline, and because of this the classes describing the section levels can be +computed by replacing the class argument the_tag of +section by the HTML name of the headline tag.
Section elements are converted to HTML by printing a headline and then +converting the contents of the element recursively. More precisely, the first +sub-element is always a title element, and the other +elements are the contents of the section. This structure is declared in the +DTD, and it is guaranteed that the document matches the DTD. Because of this +the title node can be separated from the rest without any checks.
Both the title node, and the body nodes are then converted to HTML by calling +to_html on them.
class section the_tag = + object (self) + inherit shared + + val tag = the_tag + + method to_html store ch = + let sub_nodes = self # node # sub_nodes in + match sub_nodes with + title_node :: rest -> + output_string ch ("<" ^ tag ^ ">\n"); + title_node # extension # to_html store ch; + output_string ch ("\n</" ^ tag ^ ">"); + List.iter + (fun n -> n # extension # to_html store ch) + rest + | _ -> + assert false + end +;; + +class sect1 = section "h1";; +class sect2 = section "h3";; +class sect3 = section "h4";;
Several element types are converted to HTML by simply mapping them to +corresponding HTML element types. The class map_tag +implements this, and the class argument the_target_tag +determines the tag name to map to. The output consists of the start tag, the +recursively converted inner elements, and the end tag. + +
class map_tag the_target_tag = + object (self) + inherit shared + + val target_tag = the_target_tag + + method to_html store ch = + output_string ch ("<" ^ target_tag ^ ">\n"); + List.iter + (fun n -> n # extension # to_html store ch) + (self # node # sub_nodes); + output_string ch ("\n</" ^ target_tag ^ ">"); + end +;; + +class p = map_tag "p";; +class em = map_tag "b";; +class ul = map_tag "ul";; +class li = map_tag "li";;
Element of type br are mapped to the same HTML type. Note +that HTML forbids the end tag of br. + +
class br = + object (self) + inherit shared + + method to_html store ch = + output_string ch "<br>\n"; + List.iter + (fun n -> n # extension # to_html store ch) + (self # node # sub_nodes); + end +;;
The code type is converted to a pre +section (preformatted text). As the meaning of tabs is unspecified in HTML, +tabs are expanded to spaces. + +
class code = + object (self) + inherit shared + + method to_html store ch = + let data = self # node # data in + (* convert tabs *) + let l = String.length data in + let rec preprocess i column = + (* this is very ineffective but comprehensive: *) + if i < l then + match data.[i] with + '\t' -> + let n = 8 - (column mod 8) in + String.make n ' ' ^ preprocess (i+1) (column + n) + | '\n' -> + "\n" ^ preprocess (i+1) 0 + | c -> + String.make 1 c ^ preprocess (i+1) (column + 1) + else + "" + in + output_string ch "<p><pre>"; + output_string ch (escape_html (preprocess 0 0)); + output_string ch "</pre></p>"; + + end +;;
Hyperlinks, expressed by the a element type, are converted +to the HTML a type. If the target of the hyperlink is given +by href, the URL of this attribute can be used +directly. Alternatively, the target can be given by +readmeref in which case the ".html" suffix must be added to +the file name.
Note that within a only #PCDATA is allowed, so the contents +can be converted directly by applying escape_html to the +character data contents. + +
class a = + object (self) + inherit shared + + method to_html store ch = + output_string ch "<a "; + let href = + match self # node # attribute "href" with + Value v -> escape_html v + | Valuelist _ -> assert false + | Implied_value -> + begin match self # node # attribute "readmeref" with + Value v -> escape_html v ^ ".html" + | Valuelist _ -> assert false + | Implied_value -> + "" + end + in + if href <> "" then + output_string ch ("href=\"" ^ href ^ "\""); + output_string ch ">"; + output_string ch (escape_html (self # node # data)); + output_string ch "</a>"; + + end +;;
The footnote class has two methods: +to_html to convert the footnote reference to HTML, and +footnote_to_html to convert the footnote text itself.
The footnote reference is converted to a local hyperlink; more precisely, to +two anchor tags which are connected with each other. The text anchor points to +the footnote anchor, and the footnote anchor points to the text anchor.
The footnote must be allocated in the store object. By +allocating the footnote, you get the number of the footnote, and the text of +the footnote is stored until the end of the HTML page is reached when the +footnotes can be printed. The to_html method stores simply +the object itself, such that the footnote_to_html method is +invoked on the same object that encountered the footnote.
The to_html only allocates the footnote, and prints the +reference anchor, but it does not print nor convert the contents of the +note. This is deferred until the footnotes actually get printed, i.e. the +recursive call of to_html on the sub nodes is done by +footnote_to_html.
Note that this technique does not work if you make another footnote within a +footnote; the second footnote gets allocated but not printed.
class footnote = + object (self) + inherit shared + + val mutable footnote_number = 0 + + method to_html store ch = + let number = + store # alloc_footnote (self : #shared :> footnote_printer) in + let foot_anchor = + "footnote" ^ string_of_int number in + let text_anchor = + "textnote" ^ string_of_int number in + footnote_number <- number; + output_string ch ( "<a name=\"" ^ text_anchor ^ "\" href=\"#" ^ + foot_anchor ^ "\">[" ^ string_of_int number ^ + "]</a>" ) + + method footnote_to_html store ch = + (* prerequisite: we are in a definition list <dl>...</dl> *) + let foot_anchor = + "footnote" ^ string_of_int footnote_number in + let text_anchor = + "textnote" ^ string_of_int footnote_number in + output_string ch ("<dt><a name=\"" ^ foot_anchor ^ "\" href=\"#" ^ + text_anchor ^ "\">[" ^ string_of_int footnote_number ^ + "]</a></dt>\n<dd>"); + List.iter + (fun n -> n # extension # to_html store ch) + (self # node # sub_nodes); + output_string ch ("\n</dd>") + + end +;;
This code sets up the hash table that connects element types with the exemplars +of the extension classes that convert the elements to HTML. + +
open Pxp_yacc + +let tag_map = + make_spec_from_alist + ~data_exemplar:(new data_impl (new only_data)) + ~default_element_exemplar:(new element_impl (new no_markup)) + ~element_alist: + [ "readme", (new element_impl (new readme)); + "sect1", (new element_impl (new sect1)); + "sect2", (new element_impl (new sect2)); + "sect3", (new element_impl (new sect3)); + "title", (new element_impl (new no_markup)); + "p", (new element_impl (new p)); + "br", (new element_impl (new br)); + "code", (new element_impl (new code)); + "em", (new element_impl (new em)); + "ul", (new element_impl (new ul)); + "li", (new element_impl (new li)); + "footnote", (new element_impl (new footnote : #shared :> shared)); + "a", (new element_impl (new a)); + ] + () +;;