4 >A complete example: The readme DTD</TITLE
7 CONTENT="Modular DocBook HTML Stylesheet Version 1.46"><LINK
9 TITLE="The PXP user's guide"
10 HREF="index.html"><LINK
15 TITLE="Highlights of XML"
16 HREF="x107.html"><LINK
19 HREF="c533.html"><LINK
22 HREF="markup.css"></HEAD
41 >The PXP user's guide</TH
56 >Chapter 1. What is XML?</TD
75 NAME="SECT.README.DTD"
76 >1.3. A complete example: The <I
85 > was that I often wrote two versions
86 of files such as README and INSTALL which explain aspects of a distributed
87 software archive; one version was ASCII-formatted, the other was written in
88 HTML. Maintaining both versions means twice the work, and changes
89 made to one version may be forgotten in the other. To improve this situation
93 > DTD which allows me to maintain only
94 one source written as an XML document, and to generate the ASCII and the HTML
97 >In this section, I explain only the DTD. The <I
101 contained in the <SPAN
104 > distribution together with the two converters to
105 produce ASCII and HTML. Another <A
108 > of this manual describes the HTML
111 >The documents have a simple structure: There are up to three levels of nested
112 sections, paragraphs, item lists, footnotes, hyperlinks, and text emphasis. The
113 outermost element usually has the type <TT
120 CLASS="PROGRAMLISTING"
121 ><!ELEMENT readme (sect1+)>
123 title CDATA #REQUIRED></PRE
126 This means that this element contains one or more sections of the first level
130 >), and that the element has a required
134 > containing character data (CDATA). Note that
138 > elements must not contain text data.</P
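><P
>As a small illustration of this declaration, a validating parser should accept a document of the following shape. The attribute value and the section content are invented for this sketch; the content of the first-level section follows the declarations given later in this section:</P
><PRE
CLASS="PROGRAMLISTING"
><readme title="Example">
  <sect1>
    <title>Overview</title>
    <p>Some text.</p>
  </sect1>
</readme></PRE
><P
>Omitting the title attribute, leaving out all first-level sections, or placing character data directly inside the root element would each make the document invalid.</P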
140 >The three levels of sections are declared as follows:
143 CLASS="PROGRAMLISTING"
144 ><!ELEMENT sect1 (title,(sect2|p|ul)+)>
146 <!ELEMENT sect2 (title,(sect3|p|ul)+)>
148 <!ELEMENT sect3 (title,(p|ul)+)></PRE
151 Every section has a <TT
154 > element as its first subelement. The title
155 is followed by an arbitrary but non-empty sequence of inner sections, paragraphs and
156 item lists. Note that the inner sections must belong to the next higher
160 > elements must not contain inner
161 sections because there is no next higher level.</P
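><P
>The nesting rule can be sketched with a fragment like the following; the titles and paragraph texts are made up for illustration:</P
><PRE
CLASS="PROGRAMLISTING"
><sect1>
  <title>Installation</title>
  <p>Overview of the steps.</p>
  <sect2>
    <title>Prerequisites</title>
    <sect3>
      <title>Compiler</title>
      <p>Details about the compiler.</p>
    </sect3>
  </sect2>
</sect1></PRE
><P
>Moving the third-level section directly into the first-level section, or leaving any section with only its title, would violate the declarations above.</P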
163 >Obviously, all three declarations allow paragraphs (<TT
170 >). The definition can be simplified at this
171 point by using a parameter entity:
174 CLASS="PROGRAMLISTING"
175 ><!ENTITY % p.like "p|ul">
177 <!ELEMENT sect1 (title,(sect2|%p.like;)+)>
179 <!ELEMENT sect2 (title,(sect3|%p.like;)+)>
181 <!ELEMENT sect3 (title,(%p.like;)+)></PRE
187 > is nothing but a macro abbreviating
188 the same sequence of declarations; if new elements on the same level as
195 > are later added, it is
196 sufficient to change only the entity definition. Note that there are some
197 restrictions on the usage of entities in this context; most importantly, entities
198 containing a left parenthesis must also contain the corresponding right
201 >Note that the entity <TT
208 > entity, i.e. the ENTITY declaration contains a
209 percent sign, and the entity is referred to by
213 >. This kind of entity must be used to abbreviate
214 parts of the DTD; the <I
217 > entities declared without a
218 percent sign and referred to as <TT
227 > element specifies the title of the section in
228 which it occurs. The title is given as character data, optionally interspersed
229 with line breaks (<TT
235 CLASS="PROGRAMLISTING"
236 ><!ELEMENT title (#PCDATA|br)*></PRE
239 Compared with the <TT
249 > element, this element allows inner markup
253 >) while attribute values do not: It is an error if
254 an attribute value contains a literal left angle bracket (<), so it
255 is impossible to include inner elements there.</P
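><P
>To make the difference concrete, here is a sketch with invented title text. The first line uses the element and may contain a line break; the second line attempts the same inside the attribute value and is not well-formed, so a parser should reject it:</P
><PRE
CLASS="PROGRAMLISTING"
><title>The readme converters<br/>User notes</title>
<readme title="The readme converters<br/>User notes"></PRE
><P
>If a title really needs a line break, it therefore has to be a section title, not the title of the whole document.</P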
257 >The paragraph element <TT
260 > has a structure similar to
264 >, but it allows more inner elements:
267 CLASS="PROGRAMLISTING"
268 ><!ENTITY % text "br|code|em|footnote|a">
270 <!ELEMENT p (#PCDATA|%text;)*></PRE
273 Line breaks do not have inner structure, so they are declared as being empty:
276 CLASS="PROGRAMLISTING"
277 ><!ELEMENT br EMPTY></PRE
280 This means that really nothing is allowed within <TT
284 must always write <TT
286 ><br></br></TT
293 >Code samples should be marked up by the <TT
297 text can be indicated by <TT
303 CLASS="PROGRAMLISTING"
304 ><!ELEMENT code (#PCDATA)>
306 <!ELEMENT em (#PCDATA|%text;)*></PRE
312 > elements are not allowed to contain further markup
316 > elements do is a design decision by the author of
319 >Unordered lists simply consist of one or more list items, and a list item may
320 contain paragraph-level material:
323 CLASS="PROGRAMLISTING"
324 ><!ELEMENT ul (li+)>
326 <!ELEMENT li (%p.like;)*></PRE
329 Footnotes are described by the text of the note; this text may contain
330 text-level markup. There is no mechanism to describe the numbering scheme of
331 footnotes, or to specify how footnote references are printed.
334 CLASS="PROGRAMLISTING"
335 ><!ELEMENT footnote (#PCDATA|%text;)*></PRE
338 Hyperlinks are written as in HTML. The anchor tag contains the text describing
339 where the link points to, and the <TT
343 pointer (as a URL). There is no way to describe locations of "hash marks". If the
344 link refers to another <I
347 > document, the attribute
351 > should be used instead of <TT
355 The reason is that the converted document usually has a different system
356 identifier (file name), and the link to a converted document must be
360 CLASS="PROGRAMLISTING"
361 ><!ELEMENT a (#PCDATA)*>
364 readmeref CDATA #IMPLIED
368 Note that although it is only sensible to specify one of the two attributes,
369 the DTD has no means to express this restriction.</P
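><P
>The following lines sketch the three possible combinations. The URL, the file name and the link texts are hypothetical, and the sketch assumes that both attributes are optional, as the text above implies:</P
><PRE
CLASS="PROGRAMLISTING"
><a href="http://www.example.org/">project home page</a>
<a readmeref="INSTALL.xml">installation notes</a>
<a href="http://www.example.org/" readmeref="INSTALL.xml">both attributes</a></PRE
><P
>Under that assumption all three lines are valid with respect to the DTD, although only the first two are sensible; how the third is treated is left to the converter.</P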
371 >That completes the DTD. Finally, here is a document that conforms to it:
374 CLASS="PROGRAMLISTING"
375 ><?xml version="1.0" encoding="ISO-8859-1"?>
376 <!DOCTYPE readme SYSTEM "readme.dtd">
377 <readme title="How to use the readme converters">
379 <title>Usage</title>
381 The <em>readme</em> converter is invoked on the command line by:
384 <code>readme [ -text | -html ] input.xml</code>
387 Here is a list of options:
391 <p><code>-text</code>: specifies that ASCII output should be produced</p>
394 <p><code>-html</code>: specifies that HTML output should be produced</p>
398 The input file must be given on the command line. The converted output is
399 printed to <em>stdout</em>.
403 <title>Author</title>
405 The program has been written by
406 <a href="mailto:Gerd.Stolpmann@darmstadt.netsurf.de">Gerd Stolpmann</a>.
409 </readme></PRE
452 >Highlights of XML</TD