X-Git-Url: http://matita.cs.unibo.it/gitweb/?p=helm.git;a=blobdiff_plain;f=helm%2FDEVEL%2Fpxp%2Fpxp%2Fdoc%2Fmanual%2Fhtml%2Fx1629.html;fp=helm%2FDEVEL%2Fpxp%2Fpxp%2Fdoc%2Fmanual%2Fhtml%2Fx1629.html;h=0000000000000000000000000000000000000000;hp=06b1e60ea5caac67fb51a5aabe19d8341e6a6735;hb=3ef089a4c58fbe429dd539af6215991ecbe11ee2;hpb=1c7fb836e2af4f2f3d18afd0396701f2094265ff diff --git a/helm/DEVEL/pxp/pxp/doc/manual/html/x1629.html b/helm/DEVEL/pxp/pxp/doc/manual/html/x1629.html deleted file mode 100644 index 06b1e60ea..000000000 --- a/helm/DEVEL/pxp/pxp/doc/manual/html/x1629.html +++ /dev/null @@ -1,895 +0,0 @@ -Resolvers and sources
The PXP user's guide

4.2. Resolvers and sources

4.2.1. Using the built-in resolvers (called sources)

The type source enumerates the two possible origins of the document to parse.

type source =
    Entity of ((dtd -> Pxp_entity.entity) * Pxp_reader.resolver)
  | ExtID of (ext_id * Pxp_reader.resolver)

You normally do not need to worry about this type, as there are convenience functions, such as from_string, that create source values.

4.2.2. The resolver API

A resolver is an object that can be opened like a file. Instead of a file name, however, you pass the XML identifier of the entity to read from (either a SYSTEM or PUBLIC clause). When opened, the resolver must return the Lexing.lexbuf that reads the characters. The resolver can be closed, and it can be cloned. Furthermore, it is possible to tell the resolver which character set it should assume.

The following is from Pxp_reader:

exception Not_competent
exception Not_resolvable of exn

class type resolver =
  object
    method init_rep_encoding : rep_encoding -> unit
    method init_warner : collect_warnings -> unit
    method rep_encoding : rep_encoding
    method open_in : ext_id -> Lexing.lexbuf
    method close_in : unit
    method change_encoding : string -> unit
    method clone : resolver
    method close_all : unit
  end

The resolver object must work as follows:

Exceptions. It is possible to chain resolvers such that when the first resolver is not able to open the entity, the other resolvers of the chain are tried in turn. The method open_in should raise the exception Not_competent to indicate that the next resolver should try to open the entity. If the resolver is able to handle the ID, but some other error occurs, the exception Not_resolvable should be raised to break the chain.
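The chaining rule can be illustrated with a small stand-alone sketch. It does not use PXP: the two exceptions are local stand-ins for those of Pxp_reader, and toy_resolver, open_with_chain, only_a, broken_b and fallback are hypothetical names introduced only for this illustration. A "resolver" is reduced here to a function from an identifier to the entity text.

```ocaml
(* Local stand-ins for the two Pxp_reader exceptions, so that the sketch
   compiles without PXP installed. *)
exception Not_competent
exception Not_resolvable of exn

(* Hypothetical simplification: a "resolver" is just a function from an
   identifier to the text of the entity. *)
type toy_resolver = string -> string

(* Try the resolvers in order. Not_competent passes control to the next
   resolver; any other exception, including Not_resolvable, propagates
   and therefore breaks the chain. *)
let rec open_with_chain (chain : toy_resolver list) (id : string) : string =
  match chain with
  | [] -> raise Not_competent
  | r :: rest ->
      (try r id with Not_competent -> open_with_chain rest id)

(* Accepts only the identifier "a". *)
let only_a id = if id = "a" then "<a/>" else raise Not_competent

(* Is responsible for "b", but fails while reading it. *)
let broken_b id =
  if id = "b" then raise (Not_resolvable (Failure "cannot read b"))
  else raise Not_competent

(* Accepts every identifier. *)
let fallback _ = "<fallback/>"
```

With the chain [only_a; broken_b; fallback], the identifier "c" falls through to fallback, whereas "b" ends with Not_resolvable and fallback is never consulted.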

Example: How to define a resolver that is equivalent to from_string: ...

4.2.3. Predefined resolver components

There are some classes in Pxp_reader that define common resolver behaviour.

class resolve_read_this_channel : 
    ?id:ext_id -> 
    ?fixenc:encoding -> 
    ?auto_close:bool -> 
    in_channel -> 
        resolver

Reads from the passed channel (which may even be a pipe). If the ~id argument is passed to the object, the created resolver accepts only this ID. Otherwise all IDs are accepted.

Once the resolver has been cloned, it does not accept any ID. This means that this resolver cannot handle inner references to external entities. Note that you can combine this resolver with another resolver that can handle inner references (such as resolve_as_file); see class 'combine' below.

If you pass the ~fixenc argument, the encoding of the channel is set to the passed value, regardless of any auto-recognition or any XML declaration.

If ~auto_close = true (which is the default), the channel is closed after use. If ~auto_close = false, the channel is left open.

class resolve_read_any_channel : 
    ?auto_close:bool -> 
    channel_of_id:(ext_id -> (in_channel * encoding option)) -> 
        resolver

This resolver calls the function ~channel_of_id to open a new channel for the passed ext_id. This function must either return the channel and the encoding, or fail with Not_competent. The function must return None as encoding if the default mechanism to recognize the encoding should be used. It must return Some e if it is already known that the encoding of the channel is e. If ~auto_close = true (which is the default), the channel is closed after use. If ~auto_close = false, the channel is left open.

class resolve_read_url_channel :
    ?base_url:Neturl.url ->
    ?auto_close:bool -> 
    url_of_id:(ext_id -> Neturl.url) -> 
    channel_of_url:(Neturl.url -> (in_channel * encoding option)) -> 
        resolver

When this resolver gets an ID to read from, it calls the function ~url_of_id to get the corresponding URL. This URL may be a relative URL; however, it must use a URL scheme that contains a path. The resolver converts the URL to an absolute URL if necessary. The second function, ~channel_of_url, is fed with the absolute URL as input. This function opens the resource to read from, and returns the channel and the encoding of the resource.

Both functions, ~url_of_id and ~channel_of_url, can raise Not_competent to indicate that the object is not able to read from the specified resource. However, there is a difference: a Not_competent from ~url_of_id is left as it is, but a Not_competent from ~channel_of_url is converted to Not_resolvable. So only ~url_of_id decides which URLs are accepted by the resolver and which are not.
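The difference between the two functions can be modelled in a few lines. This is only a sketch under simplifying assumptions, not the code of resolve_read_url_channel: IDs and URLs are plain strings, "opening" a URL yields a string rather than an in_channel, the exceptions are local stand-ins, and the payload carried by Not_resolvable is chosen arbitrarily. The name open_via_url and the two example functions are hypothetical.

```ocaml
(* Local stand-ins for the Pxp_reader exceptions. *)
exception Not_competent
exception Not_resolvable of exn

(* Hypothetical model of the two-step lookup. *)
let open_via_url ~url_of_id ~channel_of_url id =
  (* A Not_competent raised by url_of_id is deliberately not caught, so a
     surrounding chain of resolvers can go on to the next resolver. *)
  let url = url_of_id id in
  (* Once a URL has been accepted, a Not_competent from channel_of_url is
     reported as Not_resolvable. The payload is an assumption of this
     sketch. *)
  try channel_of_url url
  with Not_competent -> raise (Not_resolvable Not_competent)

(* Example functions: only "file:" identifiers are accepted, and only one
   resource actually exists. *)
let url_of_id id =
  if String.length id >= 5 && String.sub id 0 5 = "file:" then id
  else raise Not_competent

let channel_of_url url =
  if url = "file:/known.xml" then "<known/>" else raise Not_competent
```

In this model, an identifier that does not start with "file:" is rejected with Not_competent, while "file:/missing.xml" is accepted by url_of_id and therefore ends with Not_resolvable.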

The function ~channel_of_url must return None as encoding if the default mechanism to recognize the encoding should be used. It must return Some e if it is already known that the encoding of the channel is e.

If ~auto_close = true (which is the default), the channel is closed after use. If ~auto_close = false, the channel is left open.

Objects of this class contain a base URL relative to which relative URLs are interpreted. When creating a new object, you can specify the base URL by passing it as the ~base_url argument. When an existing object is cloned, the base URL of the clone is the URL of the original object.

Note that the term "base URL" has a strict definition in RFC 1808.

class resolve_read_this_string : 
    ?id:ext_id -> 
    ?fixenc:encoding -> 
    string -> 
        resolver

Reads from the passed string. If the ~id argument is passed to the object, the created resolver accepts only this ID. Otherwise all IDs are accepted.

Once the resolver has been cloned, it does not accept any ID. This means that this resolver cannot handle inner references to external entities. Note that you can combine this resolver with another resolver that can handle inner references (such as resolve_as_file); see class 'combine' below.

If you pass the ~fixenc argument, the encoding of the string is set to the passed value, regardless of any auto-recognition or any XML declaration.

class resolve_read_any_string : 
    string_of_id:(ext_id -> (string * encoding option)) -> 
        resolver

This resolver calls the function ~string_of_id to get the string for the passed ext_id. This function must either return the string and the encoding, or fail with Not_competent. The function must return None as encoding if the default mechanism to recognize the encoding should be used. It must return Some e if it is already known that the encoding of the string is e.
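A ~string_of_id function might look like the following sketch. To keep it compilable without PXP, the types ext_id and encoding and the exception Not_competent are local, reduced mirrors of the library definitions (an assumption of this sketch); the lookup table and its single entry are invented for illustration.

```ocaml
(* Local, reduced mirrors of the PXP definitions; with PXP installed the
   library's own types and Pxp_reader.Not_competent would be used. *)
exception Not_competent
type ext_id = System of string | Public of (string * string)
type encoding = [ `Enc_utf8 | `Enc_iso88591 ]

(* Hypothetical in-memory "catalogue" mapping SYSTEM identifiers to
   document texts. *)
let table = [ ("urn:example:greeting", "<greeting>hello</greeting>") ]

let string_of_id (id : ext_id) : string * encoding option =
  match id with
  | System sysid ->
      (match List.assoc_opt sysid table with
       (* The encoding of the literal is known, so it is returned as
          Some e instead of None. *)
       | Some text -> (text, Some `Enc_utf8)
       (* Unknown identifier: let another resolver try. *)
       | None -> raise Not_competent)
  (* PUBLIC identifiers are not handled by this sketch. *)
  | Public _ -> raise Not_competent
```

Going by the class signature above, such a function would be passed as new Pxp_reader.resolve_read_any_string ~string_of_id, provided it uses the library's own types and exception.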

class resolve_as_file :
    ?file_prefix:[ `Not_recognized | `Allowed | `Required ] ->
    ?host_prefix:[ `Not_recognized | `Allowed | `Required ] ->
    ?system_encoding:encoding ->
    ?url_of_id:(ext_id -> Neturl.url) -> 
    ?channel_of_url: (Neturl.url -> (in_channel * encoding option)) ->
    unit -> 
        resolver

Reads from the local file system. Every file name is interpreted as a file name of the local file system, and the file it refers to is read.

The full form of a file URL is: file://host/path, where 'host' specifies the host system where the file identified by 'path' resides. host = "" or host = "localhost" are accepted; other values will raise Not_competent. The standard for file URLs is defined in RFC 1738.

Option ~file_prefix: Specifies how the "file:" prefix of file names is handled. The possible values are `Not_recognized, `Allowed, and `Required.

Option ~host_prefix: Specifies how the "//host" phrase of file names is handled. The possible values are `Not_recognized, `Allowed, and `Required.

Option ~system_encoding: Specifies the encoding of file names of the local file system. Default: UTF-8.

Options ~url_of_id, ~channel_of_url: Not for the casual user!

class combine : 
    ?prefer:resolver -> 
    resolver list -> 
        resolver

Combines several resolver objects. If a concrete entity with an ext_id is to be opened, the combined resolver tries the contained resolvers in turn until a resolver accepts opening the entity (i.e. it does not raise Not_competent on open_in).

Clones: If the 'clone' method is invoked before 'open_in', all contained resolvers are cloned separately and combined again. If the 'clone' method is invoked after 'open_in' (i.e. while the resolver is open), the clone of the active resolver is additionally flagged as being preferred, i.e. it is tried first.


\ No newline at end of file