******************************************************************************
README - Netstring, string processing functions for the net
******************************************************************************

==============================================================================
Abstract
==============================================================================

Netstring is a collection of string processing functions that are useful in
conjunction with Internet messages and protocols. In particular, it contains
functions for the following purposes:

- Parsing MIME messages

- Several encoding/decoding functions (Base64, Quoted Printable, Q,
  URL-encoding)

- A new implementation of the CGI interface that allows users to upload
  files

- A simple HTML parser

- URL parsing, printing and processing

- Conversion between character sets

==============================================================================
Download
==============================================================================

You can download Netstring as a gzip'ed tarball [1].

==============================================================================
Documentation
==============================================================================

Sorry, there is no manual. The mli files describe each function in detail.
Furthermore, the following additional information may be useful.

------------------------------------------------------------------------------
New CGI implementation
------------------------------------------------------------------------------

For a long time, the CGI implementation by Jean-Christophe Filliatre was the
only freely available module that implemented the CGI interface (it is also
based on code by Daniel de Rauglaudre). It worked well, but it did not
support file uploads because this requires a parser for MIME messages. The
main goal of Netstring is to realize such uploads, and because of this it
contains an almost complete parser for MIME messages.

The new CGI implementation provides the same functions as the old one, plus
some extensions. If you call Cgi.parse_args(), you get the CGI parameters as
before, but as already explained this also works if the parameters are
encapsulated as a MIME message. In the HTML code, you can select the MIME
format by using
...
- this "enctype" attribute forces the browser to send the form parameters as
a multipart MIME message (Note: You can neither send the parameters of a
conventional hyperlink as a MIME message nor the form parameters if the
"method" is "get"). In many browsers only this particular encoding enables
the file upload elements; you cannot perform file uploads with other
encodings.

As MIME messages can transport MIME types, filenames, and other additional
properties, it is also possible to get these using the enhanced interface.
After calling

    Cgi.parse_arguments config

you can get all available information about a certain parameter by invoking

    let param = Cgi.argument "name"

- where "param" has the type "argument". There are several accessor functions
to extract the various aspects of arguments (name, filename, value by string,
value by temporary file, MIME type, MIME header) from "argument" values.

------------------------------------------------------------------------------
Base64, and other encodings
------------------------------------------------------------------------------

Netstring is also the successor of the Base64 package. It provides a
Base64-compatible interface, and an enhanced API. The latter is contained in
the Netencoding module which also offers implementations of the "quoted
printable", "Q", and "URL" encodings. Please see netencoding.mli for details.

------------------------------------------------------------------------------
The MIME scanner functions
------------------------------------------------------------------------------

In the Mimestring module you can find several functions scanning parts of
MIME messages. These functions already cover most aspects of MIME messages:
Scanning of headers, analysis of structured header entries, and scanning of
multipart bodies. Of course, a full-featured MIME scanner would require some
more functions, especially concrete parsers for frequent structures (mail
addresses or date strings).

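
To give a rough idea of what scanning a header involves, here is a minimal
sketch that uses only the OCaml standard library. It is not the Mimestring
API; the function name scan_header is hypothetical and chosen for this
illustration only. The sketch folds continuation lines into the preceding
field (simplified: the line break and indentation are replaced by a single
space) and splits each field into a lowercased name and a value.

```ocaml
(* Hypothetical helper, NOT part of Netstring: split an RFC 822 style
   header block into (lowercased name, value) pairs. Continuation lines
   (those starting with a space or tab) are appended to the previous
   field. Scanning stops at the first empty line. *)
let scan_header (text : string) : (string * string) list =
  (* Drop a trailing '\r' so that both CRLF and LF line endings work. *)
  let strip_cr l =
    let n = String.length l in
    if n > 0 && l.[n - 1] = '\r' then String.sub l 0 (n - 1) else l
  in
  let lines = List.map strip_cr (String.split_on_char '\n' text) in
  let is_continuation l =
    l <> "" && (l.[0] = ' ' || l.[0] = '\t')
  in
  (* [acc] holds the finished fields in reverse order. *)
  let rec loop acc = function
    | [] -> List.rev acc
    | "" :: _ -> List.rev acc            (* empty line ends the header *)
    | l :: rest when is_continuation l ->
        (match acc with
         | (name, value) :: acc' ->
             loop ((name, value ^ " " ^ String.trim l) :: acc') rest
         | [] -> loop acc rest)          (* stray continuation: skip *)
    | l :: rest ->
        (match String.index_opt l ':' with
         | Some i ->
             let name = String.lowercase_ascii (String.sub l 0 i) in
             let value =
               String.trim (String.sub l (i + 1) (String.length l - i - 1))
             in
             loop ((name, value) :: acc) rest
         | None -> loop acc rest)        (* malformed line: skip *)
  in
  loop [] lines
```

Analysing structured entries (for example, extracting the "boundary"
parameter from a Content-Type value) would be a further step on top of this
and is not shown here.
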
Please see the file mimestring.mli for details.

------------------------------------------------------------------------------
The HTML parser
------------------------------------------------------------------------------

The HTML parser should be able to read every HTML file, whether it is correct
or not. The parser tries to recover from parsing errors as much as possible.
The parser returns the HTML term as a conventional recursive value (i.e. no
object-oriented design).

The parser depends a bit on knowledge about the HTML version, mainly because
it needs to know the tags that are always empty. You may need to adjust this
configuration before the parser works well enough for your purpose.

Please see the Nethtml module for details.

------------------------------------------------------------------------------
The abstract data type URL
------------------------------------------------------------------------------

The module Neturl contains support for URL parsing and processing. The
implementation strictly follows the standards RFC 1738 and RFC 1808. URLs can
be parsed, and several accessor functions allow the user to get components of
parsed URLs, or to change components. Modifying URLs is safe; it is
impossible to create a URL that does not have a valid string representation.

Both absolute and relative URLs are supported. It is possible to apply a
relative URL to a base URL in order to get the corresponding absolute URL.

------------------------------------------------------------------------------
Conversion between character sets and encodings
------------------------------------------------------------------------------

The module Netconversion converts strings from one character set to another.
It is Unicode-based, and there are conversion tables for more than 50
encodings.

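
What "Unicode-based" conversion means can be illustrated with the simplest
possible case. The following sketch uses only the OCaml standard library and
is not the Netconversion API; the function name latin1_to_utf8 is
hypothetical. It maps each ISO-8859-1 byte to its Unicode code point and then
writes that code point in UTF-8. For ISO-8859-1 the mapping step is the
identity; other character sets would presumably need a lookup in a conversion
table at that point.

```ocaml
(* Hypothetical helper, NOT the Netconversion API: convert an
   ISO-8859-1 string to UTF-8. Every Latin-1 byte equals its Unicode
   code point (0x00..0xFF), so no mapping table is needed here. *)
let latin1_to_utf8 (s : string) : string =
  let buf = Buffer.create (String.length s * 2) in
  String.iter
    (fun c ->
       let cp = Char.code c in
       if cp < 0x80 then
         (* ASCII range: one byte, unchanged *)
         Buffer.add_char buf c
       else begin
         (* Code points 0x80..0xFF take two bytes: 110xxxxx 10xxxxxx *)
         Buffer.add_char buf (Char.chr (0xC0 lor (cp lsr 6)));
         Buffer.add_char buf (Char.chr (0x80 lor (cp land 0x3F)))
       end)
    s;
  Buffer.contents buf

(* Usage: the Latin-1 byte 0xE9 (e with acute accent) becomes the
   two-byte UTF-8 sequence 0xC3 0xA9. *)
let () = assert (latin1_to_utf8 "caf\xE9" = "caf\xC3\xA9")
```
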
==============================================================================
Author, Copying
==============================================================================

Netstring has been written by Gerd Stolpmann [2]. You may copy it as you
like, and you may use it even for commercial purposes as long as the license
conditions are respected; see the file LICENSE coming with the distribution.
It allows almost everything.

==============================================================================
History
==============================================================================

- Changed in 0.9.3: Fixed a bug in the "install" rule of the Makefile.

- Changed in 0.9.2: New format for the conversion tables which are now much
  smaller.

- Changed in 0.9.1: Updated the Makefile such that (native-code) compilation
  of netmappings.ml becomes possible.

- Changed in 0.9: Extended Mimestring module: It can now process RFC-2047
  messages. New Netconversion module which converts strings between
  character encodings.

- Changed in 0.8.1: Added the component url_accepts_8bits to
  Neturl.url_syntax. This helps processing URLs which intentionally contain
  bytes >= 0x80. Fixed a bug: Every URL containing a 'j' was malformed!

- Changed in 0.8: Added the module Neturl which provides the abstract data
  types of URLs. The whole package is now thread-safe. Added printers for the
  various opaque data types. Added labels to function arguments where
  appropriate. The following functions changed their signatures
  significantly: Cgi.mk_memory_arg, Cgi.mk_file_arg.

- Changed in 0.7: Added workarounds for frequent browser bugs. Some functions
  now take an additional argument specifying which workarounds are enabled.

- Changed in 0.6.1: Updated URLs in documentation.

- Changed in 0.6: The file upload has been re-implemented to support large
  files; the file is now read block by block and the blocks can be collected
  either in memory or in a temporary file. Furthermore, the CGI API has been
  revised.
  There is now an opaque data type "argument" that hides all implementation
  details and that is extensible (if necessary, it is possible to add
  features without breaking the interface again). The CGI argument parser can
  be configured; currently it is possible to limit the size of uploaded data,
  to control by which method arguments are processed, and to set up where
  temporary files are created. The other parts of the package that have
  nothing to do with CGI remain unchanged.

- Changed in 0.5.1: A mistake in the documentation has been corrected.

- Initial version 0.5: The Netstring package is intended to be the successor
  of the Base64-0.2 and the Cgi-0.3 packages. The sum of both numbers is 0.5,
  and because of this, the first version number is 0.5.

--------------------------

[1] see http://www.ocaml-programming.de/packages/netstring-0.9.2.tar.gz

[2] see mailto:gerd@gerd-stolpmann.de