| Revision History | ||
|---|---|---|
| Revision v2.2.8 | 2003-07-16 | Revised by: es | 
| Added info about keeping JREs up-to-date for Cocoon | ||
| Revision v2.2.7 | 2003-07-09 | Revised by: es | 
| Fixed broken links to LDP XSL and other LDP XSL specific filenames. | ||
| Revision v2.2.6 | 2003-06-16 | Revised by: sa | 
| Verified the instructions against DocBook XSL 1.57. | ||
| Revision v2.2.5 | 2003-05-16 | Revised by: sa | 
| Fixed the broken links in the external resources section. | ||
| Revision v2.2.4 | 2003-04-20 | Revised by: sa | 
| Updated links to the new Demo Site. Added new links to the resources section. | ||
| Revision v2.2.3 | 2002-11-22 | Revised by: sa | 
| Added the suggestions made by users. Added new links to the resources section. | ||
| Revision v2.2.2 | 2002-10-09 | Revised by: as | 
| This update fixes a few more typos, removes a couple of spaces that make the HTML rendering look odd. | ||
| Revision v2.2.1 | 2002-10-09 | Revised by: sa | 
| Fixed the URL to the Sample Files. | ||
| Revision v2.2 | 2002-09-29 | Revised by: as | 
| Minor corrections to the Cocoon section. | ||
| Revision v2.1 | 2002-09-15 | Revised by: sa | 
| Minor corrections to the Cocoon section. | ||
| Revision v2.0 | 2002-09-10 | Revised by: sa | 
| Added the section on serving DocBook XML 4.1.2 content using Tomcat + Cocoon. | ||
| Revision v1.5 | 2002-08-11 | Revised by: sa | 
| Added the XML section and the sample XML file. | ||
| Revision v1.4 | 2002-08-08 | Revised by: sa | 
| Many valuable modifications/corrections suggested by Lloyd D Budd. Thanks Lloyd. :) | ||
| Revision v1.3 | 2002-08-02 | Revised by: sa | 
| Added the "Additional Resources" section. | ||
| Revision v1.2 | 2002-07-23 | Revised by: sa | 
| Added the section on converting HTML -> PDF using HTMLDOC. Thanks to Luc De Louw for the suggestion. | ||
| Revision v1.1 | 2002-07-19 | Revised by: KET | 
| Fixed grammatical errors, numbered processes. | ||
| Revision v1.0 | 2002-06-29 | Revised by: sa | 
| Initial public release. | ||
Some Acronyms:
SGML - Standard Generalized Markup Language
XML - Extensible Markup Language
RTF - Rich Text Format
HTML - HyperText Markup Language
PDF - Portable Document Format
The objective of this document is to set up OpenJade to convert DocBook 3.2 and 4.2 Standard Generalized Markup Language (SGML) and Extensible Markup Language (XML) documents to HyperText Markup Language (HTML), Rich Text Format (RTF), and Portable Document Format (PDF).
This document is Copyright 2001 by Saqib Ali. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is available at http://www.gnu.org/copyleft/fdl.html.
All praise is due to Allah, The Lord of the Worlds. All credits go to Allah. Any mistake in this document is my own fault.
Additionally, I would like to acknowledge the following people for their valuable contributions to this document:
Eric Safern <esafern (at) lrn.com> - For updates related to Cocoon and JRE. http://www.timebytes.com/
Greg Ferguson <gferg (at) hoop.timonium.sgi.com> - for very helpful hints/suggestions on the docbook mailing list
Kristin Thomas <kristint (at) us.ibm.com> - For the initial review of this document.
Luc de Louw <luc (@) delouw.ch> - For suggesting the HTMLDOC (HTML -> PDF) section.
Lloyd D Budd <ldp (@) foolswisdom.org> - For suggestions on improving most of the sections of the document.
Andrew Shugg <andrew (@) neep.com.au> - For fixing errors in the ver 2.0 of this document. Neep Consulting
DocBook is a document type definition (DTD). It describes the types of structure and formats to use in technical documents, and it is commonly used because of its simplicity and completeness.
A DTD defines the syntax of a document. Essentially it is a "rule book" that describes the sets of tags and attributes that will be used to describe specific kinds of content. So DocBook is a "rule book" that is used for writing documents. Every tag that is used in writing the document must be defined very specifically and formally in the DTD.
A Document Style Semantics and Specification Language (DSSSL) stylesheet defines how to convert a Standard Generalized Markup Language (SGML) document into a human-readable viewing format such as HTML, RTF, or PDF.
The tools needed to set up OpenJade for converting SGML and XML are:
OpenJade
DocBook DTDs
ISO Entities
Norman Walsh's DSSSL
LDP DSL
HTMLDOC (optional)
Norman Walsh's XSL (optional)
LDP XSL (optional)
|  | Note | 
|---|---|
| All of these packages are free and available for download on the net. The next chapter explains how to download these packages. | 
This document assumes that you have the following already installed on your system.
gzip - available from http://www.gnu.org/directory/
gcc and GNU make - available from http://www.gnu.org/directory/
unzip - available from http://www.info-zip.org/pub/infozip/
Standard Unix utilities - tar, mkdir, mv ...
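Before downloading anything, it may help to confirm that the prerequisites listed above are actually on the PATH. The following is only a minimal sketch: the function name missing_tools is made up for this example, and the tool list is simply the one given above.

```shell
#!/bin/sh
# Print the name of every listed tool that is not found on the PATH.
# missing_tools is a hypothetical helper, not part of any package.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}

# Check the tools named in the prerequisites list.
missing=$(missing_tools gzip gcc make unzip tar mkdir mv)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All prerequisite tools were found."
fi
```

If anything is reported as missing, install it from the locations listed above before continuing.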
You'll have to download and compile only one package (OpenJade). This HOWTO will explain the compilation process, but you should be familiar with installing from source code.
Most of the packages that we need are located at The Linux Documentation Project (TLDP) website.
Create a directory /tmp/downloads. We will use this directory to store the downloaded source code.
OpenJade will be used to process DocBook documents. OpenJade can be downloaded from: http://openjade.sourceforge.net/.
At the time of writing this document OpenJade 1.3.1 was available. Download the openjade-1.3.x.tar.gz file.
All the DocBook DTDs are available from The Linux Documentation Project website at http://www.tldp.org/authors/index.html#resources
Please download DocBook SGML v4.1, DocBook SGML v3.1, and DocBook XML v4.1.2.
|  | Note | 
|---|---|
| Please download all the zip archives. | 
The Linux Documentation Project has packaged all the Entities into one big tar file and placed it at http://www.tldp.org/authors/tools/entities.tar.gz for the convenience of the users. Thanks to TLDP for this.
Norman Walsh's DSSSL can be downloaded from the DocBook project website at http://sourceforge.net/project/showfiles.php?group_id=21935.
At the time of writing this document docbook-dsssl-1.76 was available.
LDP DSL is a customized style sheet used by The Linux Documentation Project (TLDP). It is an extension to Norman Walsh's DSSSL. It adds things like a background and a Table of Contents. It can be downloaded from http://www.tldp.org/authors/tools/ldp.dsl.
ldp.dsl requires Norman Walsh's DSSSL
HTMLDOC can be used for converting the HTML to PDF. If you would like to produce PDF documents, please download HTMLDOC from http://www.easysw.com/htmldoc/software.php
This is not necessary. But if you would like to serve DocBook 4.1.2 XML content using Tomcat + Cocoon, you will need Norman Walsh's XML Style Sheets.
The Style Sheets are available for download at http://sourceforge.net/projects/docbook/.
Please download the package called docbook-xsl.
|  | Note | 
|---|---|
| Recently docbook-xsl ver. 1.57.0 was released. This document is verified with the latest version, and appropriate modifications have been made. If you still encounter any errors please email me @ <saqib@seagate.com> | 
Also download the LDP Customized XSL from http://my.core.com/~dhorton/docbook/tldp-xsl/
In this section we will install all the tools in the appropriate directories. All the tools go in the /usr/local/dbtools/ directory. Create this directory using the following command:
| # mkdir /usr/local/dbtools | 
This process is the easy part, but also the most time consuming one. Keep in mind that OpenJade takes a long time to compile. To install OpenJade, complete the following steps:
Change directories to /tmp/downloads.
| # cd /tmp/downloads | 
Unzip the file.
| # gzip -d openjade-1.3.x.tar.gz | 
Untar the file.
| # tar -xvf openjade-1.3.x.tar | 
Change directories to openjade-1.3.x.
| # cd openjade-1.3.x | 
Run the ./configure command.
| # ./configure --prefix=/usr/local/dbtools/openjade | 
Run the make command.
| # make | 
Run the make install command. After this step the OpenJade binaries will be installed under /usr/local/dbtools/openjade.
| # make install | 
Copy the dsssl directory from /tmp/downloads/openjade-1.3.x to /usr/local/dbtools/openjade
| # cp -dpR dsssl /usr/local/dbtools/openjade/ | 
In this step we will install Norman Walsh's DSSSL in an appropriate place. The DSSSL does not have to be compiled.
Change directories to /tmp/downloads
| # cd /tmp/downloads | 
Unzip the file.
| # gzip -d docbook-dsssl-1.76.tar.gz | 
Untar the file.
| # tar -xvf docbook-dsssl-1.76.tar | 
Move the unpacked directory to /usr/local/dbtools/docbook-dsssl.
| # mv docbook-dsssl-1.76 /usr/local/dbtools/docbook-dsssl | 
In this section we will install the DocBook DTDs.
Change directories to /usr/local/dbtools.
| # cd /usr/local/dbtools | 
Create three new directories called dtd3.1, dtd4.1, and dtd4.1.2.
| # mkdir dtd3.1 | 
| # mkdir dtd4.1 | 
| # mkdir dtd4.1.2 | 
Change directories to the dtd3.1.
| # cd dtd3.1 | 
Unzip the file DocBook SGML v3.1 in this directory.
| # unzip /tmp/downloads/docbk31.zip | 
Change directories to the dtd4.1.
| # cd ../dtd4.1 | 
Unzip the file DocBook SGML v4.1 in this directory.
| # unzip /tmp/downloads/docbk41.zip | 
Change directories to the dtd4.1.2.
| # cd ../dtd4.1.2 | 
Unzip the file DocBook XML v4.1.2 in this directory.
| # unzip /tmp/downloads/docbk412.zip | 
In this section we will install the ISO entities that we downloaded from the LDP website.
First we install the ISO Entities for the 3.1 SGML DTD.
Change directories to the /usr/local/dbtools/dtd3.1 directory.
| # cd /usr/local/dbtools/dtd3.1 | 
Copy /tmp/downloads/entities.tar.gz to this directory.
| # cp /tmp/downloads/entities.tar.gz . | 
Unzip the file.
| # gzip -d entities.tar.gz | 
Untar the file.
| # tar -xvf entities.tar | 
Next we install the ISO Entities for the 4.1 SGML DTD.
Change directories to the /usr/local/dbtools/dtd4.1 directory.
| # cd /usr/local/dbtools/dtd4.1 | 
Copy /tmp/downloads/entities.tar.gz to this directory.
| # cp /tmp/downloads/entities.tar.gz . | 
Unzip the file.
| # gzip -d entities.tar.gz | 
Untar the file.
| # tar -xvf entities.tar | 
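The same archive is unpacked into both DTD directories, so the two sets of steps above can be folded into one loop. The sketch below is a variation rather than the procedure described above: it streams the archive through gzip -dc instead of copying it first, and the function name install_entities is hypothetical.

```shell
#!/bin/sh
# Unpack one entities archive into each listed DTD directory,
# leaving the downloaded archive itself untouched.
# install_entities is a hypothetical helper name.
install_entities() {
    archive=$1
    shift
    for dir in "$@"; do
        # gzip runs in the current directory; only tar runs inside $dir.
        gzip -dc "$archive" | (cd "$dir" && tar -xf -) || return 1
    done
}
```

Under these assumptions it would be called as install_entities /tmp/downloads/entities.tar.gz /usr/local/dbtools/dtd3.1 /usr/local/dbtools/dtd4.1.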
Finally we install the customised LDP stylesheet.
Change directories to the /tmp/downloads directory.
| # cd /tmp/downloads | 
Copy the ldp.dsl file into the /usr/local/dbtools/docbook-dsssl/print directory.
| # cp ldp.dsl /usr/local/dbtools/docbook-dsssl/print/ldp.dsl | 
Copy the ldp.dsl file into the /usr/local/dbtools/docbook-dsssl/html directory.
| # cp ldp.dsl /usr/local/dbtools/docbook-dsssl/html/ldp.dsl | 
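At this point the tools directory should contain the OpenJade binary, the catalogs, and both copies of ldp.dsl. A quick check such as the one below can catch a missed step before any conversion is attempted. This is a sketch only: check_dbtools is a hypothetical name, and the presence of docbook.cat in the dtd4.1 directory is assumed by analogy with the other two DTD directories.

```shell
#!/bin/sh
# Print every expected file that is missing under the given tools directory.
# Returns 0 if nothing is missing, 1 otherwise.
# check_dbtools is a hypothetical helper; the dtd4.1/docbook.cat entry
# is an assumption made by analogy with dtd3.1 and dtd4.1.2.
check_dbtools() {
    root=$1
    status=0
    for path in \
        openjade/bin/openjade \
        openjade/dsssl/catalog \
        docbook-dsssl/catalog \
        docbook-dsssl/html/ldp.dsl \
        docbook-dsssl/print/ldp.dsl \
        dtd3.1/docbook.cat \
        dtd4.1/docbook.cat \
        dtd4.1.2/docbook.cat
    do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $root/$path"
            status=1
        fi
    done
    return $status
}
```

Running check_dbtools /usr/local/dbtools should print nothing if every step above succeeded.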
This step is optional. It is only required if you want to produce PDF documents from HTML.
Change back to the downloads directory.
| # cd /tmp/downloads | 
Untar the source code for HTMLDOC.
| # gzip -d htmldoc-1.8.xx-source.tar.gz | 
| # tar -xvf htmldoc-1.8.xx-source.tar | 
| # cd htmldoc-1.8.xx-1 | 
Run configure to set the installation location, then run make.
| # ./configure --prefix=/usr/local/dbtools/htmldoc | 
| # make | 
At the time of writing this document HTMLDOC ver 1.8.20-1 was available. This version had a little problem in the fonts Makefile. It would complain while installing the fonts, because the correct fonts were not available on the system.
Here is the error you will get while running make install:
| 
# make install
Making all in htmldoc...
Making all in doc...
Installing in fonts...
Installing font files in /usr/local/dbtools/htmldoc/share/htmldoc/fonts...
/bin/cp: cannot stat `ZapfChancery.afm': No such file or directory
/bin/cp: cannot stat `ZapfChancery.pfa': No such file or directory
/bin/cp: cannot stat `ZapfDingbats.afm': No such file or directory
/bin/cp: cannot stat `ZapfDingbats.pfa': No such file or directory
make[1]: *** [install] Error 1
 | 
To fix this installation issue, please edit fonts/Makefile and comment out the lines with references to ZapfChancery and ZapfDingbats fonts.
Then execute the install:
| 
# make install
Making all in htmldoc...
Making all in doc...
Installing in fonts...
Installing font files in /usr/local/dbtools/htmldoc/share/htmldoc/fonts...
Installing in data...
Installing in doc...
Installing in htmldoc...
 | 
In this section we will use OpenJade to convert DocBook SGML/XML documents to HTML, RTF, and PDF.
The SGML_CATALOG_FILES variable must be set to point to appropriate catalog files. To set the variable, use the following command for the Bourne shell:
| # export SGML_CATALOG_FILES=/usr/local/dbtools/openjade/dsssl/catalog:/usr/local/dbtools/dtd3.1/docbook.cat:/usr/local/dbtools/docbook-dsssl/catalog | 
Use the following command for the C shell:
| # setenv SGML_CATALOG_FILES /usr/local/dbtools/openjade/dsssl/catalog:/usr/local/dbtools/dtd3.1/docbook.cat:/usr/local/dbtools/docbook-dsssl/catalog | 
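The catalog list is long and differs between the SGML and XML cases only in the DTD directory, so it can be convenient to assemble it from its parts. The following Bourne shell sketch builds the same value as the export command above; the function name catalog_files is made up for this example.

```shell
#!/bin/sh
# Build the colon-separated catalog list for one DTD directory.
# catalog_files is a hypothetical helper name.
catalog_files() {
    root=$1
    dtd=$2
    echo "$root/openjade/dsssl/catalog:$root/$dtd/docbook.cat:$root/docbook-dsssl/catalog"
}

# Assign and export in two steps so that older Bourne shells accept it.
SGML_CATALOG_FILES=$(catalog_files /usr/local/dbtools dtd3.1)
export SGML_CATALOG_FILES
echo "$SGML_CATALOG_FILES"
```

Passing dtd4.1.2 as the second argument gives the value used for DocBook XML 4.1.2 documents.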
To convert from SGML to HTML, use the following command:
| # /usr/local/dbtools/openjade/bin/openjade -t sgml -d /usr/local/dbtools/docbook-dsssl/html/ldp.dsl#html DocBook-OpenJade-SGML-XML-HOWTO.sgml | 
To create a non-chunked (all in one) output:
| # /usr/local/dbtools/openjade/bin/openjade -V nochunks -t sgml -d /usr/local/dbtools/docbook-dsssl/html/ldp.dsl#html DocBook-OpenJade-SGML-XML-HOWTO.sgml | 
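The two commands above differ only in the -V nochunks option, and an RTF run follows the same pattern. The sketch below prints the command line for each case so it can be inspected before it is run. Note the assumptions: the function name openjade_cmd and the DBTOOLS variable are hypothetical, and the RTF case (the rtf backend with the print style specification of ldp.dsl) is not taken from this HOWTO and should be verified against your copy of ldp.dsl.

```shell
#!/bin/sh
# Root of the tools tree; DBTOOLS is a hypothetical variable for this sketch.
DBTOOLS=${DBTOOLS:-/usr/local/dbtools}

# Print the openjade command line for one output type.
#   html   - chunked HTML, as in the first command above
#   single - non-chunked HTML, as in the second command above
#   rtf    - ASSUMPTION: rtf backend with the print style specification
openjade_cmd() {
    fmt=$1
    src=$2
    dsssl="$DBTOOLS/docbook-dsssl"
    case "$fmt" in
        html)   opts="-t sgml -d $dsssl/html/ldp.dsl#html" ;;
        single) opts="-V nochunks -t sgml -d $dsssl/html/ldp.dsl#html" ;;
        rtf)    opts="-t rtf -d $dsssl/print/ldp.dsl#print" ;;
        *)      echo "unknown format: $fmt" >&2; return 1 ;;
    esac
    echo "$DBTOOLS/openjade/bin/openjade $opts $src"
}
```

For example, openjade_cmd single DocBook-OpenJade-SGML-XML-HOWTO.sgml prints the non-chunked command shown above.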
You can download a sample DocBook 4.1.2 XML file from http://www.xml-dev.com:8080/cocoon/mount/docbook/openjade.xml
The SGML_CATALOG_FILES variable must be set to point to appropriate catalog files. To set the variable, use the following command for the Bourne shell:
| # export SGML_CATALOG_FILES=/usr/local/dbtools/openjade/dsssl/catalog:/usr/local/dbtools/dtd4.1.2/docbook.cat:/usr/local/dbtools/docbook-dsssl/catalog | 
Use the following command for the C shell:
| # setenv SGML_CATALOG_FILES /usr/local/dbtools/openjade/dsssl/catalog:/usr/local/dbtools/dtd4.1.2/docbook.cat:/usr/local/dbtools/docbook-dsssl/catalog | 
To convert HTML to PDF we must use HTMLDOC. First create non-chunked HTML output of the SGML:
| # /usr/local/dbtools/openjade/bin/openjade -V nochunks -t sgml -d /usr/local/dbtools/docbook-dsssl/html/ldp.dsl#html DocBook-OpenJade-SGML-XML-HOWTO.sgml | 
Then run HTMLDOC to produce PDF.
| # /usr/local/dbtools/htmldoc/bin/htmldoc -f outfile.pdf input.html | 
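The two steps above can be chained in a small script. This is a sketch under one assumption that is not stated above: with -V nochunks, OpenJade is expected to write the HTML to standard output, so it is redirected into a file here. The names out_name, sgml2pdf, and DBTOOLS are hypothetical.

```shell
#!/bin/sh
# Root of the tools tree; DBTOOLS is a hypothetical variable for this sketch.
DBTOOLS=${DBTOOLS:-/usr/local/dbtools}

# Derive an output name from the SGML file name,
# e.g. guide.sgml with extension pdf gives guide.pdf.
out_name() {
    echo "${1%.sgml}.$2"
}

# Render one SGML file to single-page HTML, then hand that HTML to HTMLDOC.
# ASSUMPTION: with -V nochunks the HTML arrives on standard output.
sgml2pdf() {
    src=$1
    html=$(out_name "$src" html)
    pdf=$(out_name "$src" pdf)
    "$DBTOOLS/openjade/bin/openjade" -V nochunks -t sgml \
        -d "$DBTOOLS/docbook-dsssl/html/ldp.dsl#html" "$src" > "$html" || return 1
    "$DBTOOLS/htmldoc/bin/htmldoc" -f "$pdf" "$html"
}
```

Calling sgml2pdf DocBook-OpenJade-SGML-XML-HOWTO.sgml would then leave an .html and a .pdf file next to the source.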
There are 3 ways to serve DocBook 4.1.2 XML from a web server:
Command line - content pre-processed with OpenJade or XSLT
Scripting - PHP, Perl, Python
Application server - Tomcat + Cocoon
Using an application server setup like Tomcat + Cocoon is the best option.
|  | Cocoon in Action | 
|---|---|
| To see an example of web server running Tomcat + Cocoon serving DocBook 4.1.2 XML content, please visit http://www.xml-dev.com:8080/cocoon/mount/docbook/ | 
In this section we will see how to serve DocBook 4.1.2 XML content using Tomcat + Cocoon.
Tomcat is a Java Servlet container. For more information please visit http://jakarta.apache.org/tomcat/index.html.
Apache Cocoon is an XML publishing framework. For more information please visit http://xml.apache.org/cocoon/index.html.
This HOWTO will not go into details of setting up Tomcat + Cocoon, since it is already explained in the document http://xml.apache.org/cocoon/installing/index.html. Setting up Tomcat + Cocoon is an easy process and should take less than five minutes.
Once you have the Cocoon + Tomcat setup working, please follow the next sections to serve DocBook 4.1.2 XML content.
|  | One important caveat: users in the field have experienced compatibility issues with the DocBook stylesheets and some versions of the Xalan XSLT processor. Xalan is the processor bundled with Sun's JRE, so that's what you're using by default. | 
At the very least, make sure you're using the latest JRE from Sun (at this writing, 1.4.2).
Also consider upgrading Xalan to the latest release. At this writing, the latest Sun JRE, 1.4.2, is bundled with Xalan 2.4.1, while Xalan itself is up to version 2.5.1.
To check the version currently installed, type
| # java org.apache.xalan.xslt.EnvironmentCheck | 
For more info, visit http://xml.apache.org/xalan-j/faq.html .
In this step we will install Norman Walsh's XSL under the /usr/local/dbtools/ directory.
Change to the /tmp/downloads directory and untar the docbook-xsl file.
| # cd /tmp/downloads/ | 
| # gzip -d docbook-xsl-1.53.0.tar.gz | 
| # tar -xvf docbook-xsl-1.53.0.tar | 
To install docbook-xsl, move the unpacked directory to /usr/local/dbtools/docbook-xsl.
| # mv docbook-xsl-1.53.0 /usr/local/dbtools/docbook-xsl | 
Next install the LDP XSL.
Unzip and untar tldp-xsl-xxxxx.tar.gz, and then copy all the files to the /usr/local/dbtools/docbook-xsl/html directory.
| # cd /tmp/downloads | 
| # gzip -d tldp-xsl-xxxxx.tar.gz | 
| # tar -xvf tldp-xsl-xxxxx.tar | 
| # mv tldp-html*.xsl /usr/local/dbtools/docbook-xsl/html | 
$COCOON_HOME points to the Cocoon Web Application Directory. This directory is typically /usr/local/jakarta-tomcat-4.1.9/webapps/cocoon/
Create a directory named docbook under the $COCOON_HOME/mount. This is where we will put all our DocBook XML 4.1.2 content.
| # mkdir $COCOON_HOME/mount/docbook | 
Create a file named sitemap.xmap in the $COCOON_HOME/mount/docbook directory with the following content:
| # cd $COCOON_HOME/mount/docbook | 
| # vi sitemap.xmap | 
| 
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
    <!-- use the standard components -->
    <map:components>
        <map:generators default="file"/>
        <map:transformers default="xslt"/>
        <map:readers default="resource"/>
        <map:serializers default="html"/>
        <map:selectors default="browser"/>
        <map:matchers default="wildcard"/>
    </map:components>
      
    <map:pipelines>
        <map:pipeline>
            <map:match pattern="">
                <map:generate src="samples.xml"/>
                <map:transform src="/usr/local/jakarta-tomcat-4.1.9/webapps/cocoon/mount/editor/stylesheets/simple-page2html.xsl"/>
                <map:serialize/>
            </map:match>
            <!-- respond to *.html requests with 
                 our docs processed by .xsl -->
            <map:match pattern="*.html">
                <map:generate src="{1}.xml"/>
                <map:transform src="/usr/local/dbtools/docbook-xsl/html/tldp-html.xsl"/>
                <map:serialize type="html"/>
            </map:match>
            
            <!-- respond to *.pdf requests with 
                 our docs processed by fo/docbook.xsl -->
            <map:match pattern="*.pdf">
                <map:generate src="{1}.xml"/>
                <map:transform src="/usr/local/dbtools/docbook-xsl/fo/docbook.xsl"/>
                <map:serialize type="fo2pdf"/>
            </map:match>
            <map:match pattern="*.xml">
                <map:generate src="{1}.xml"/>
                <map:serialize type="xml"/>
            </map:match>
        </map:pipeline>
    </map:pipelines>
</map:sitemap>
 | 
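To make the matching rules in the sitemap easier to follow, the sketch below mimics them with a shell case statement: given the part of the request after mount/docbook/, it prints the source file, the stylesheet, and the serializer the sitemap would select. It is an illustration only, not how Cocoon works internally. The function name route is hypothetical, and shell patterns are looser than Cocoon's wildcard matcher (for example, a shell * also matches slashes).

```shell
#!/bin/sh
# Illustrative only: mimic how the sitemap above picks a source file,
# a stylesheet, and a serializer for a request under mount/docbook/.
# route is a hypothetical helper name.
route() {
    case "$1" in
        "")     echo "samples.xml simple-page2html.xsl default" ;;
        *.html) echo "${1%.html}.xml tldp-html.xsl html" ;;
        *.pdf)  echo "${1%.pdf}.xml fo/docbook.xsl fo2pdf" ;;
        *.xml)  echo "$1 none xml" ;;
        *)      echo "no match"; return 1 ;;
    esac
}
```

For instance, route openjade.html reports that openjade.xml is read, transformed with tldp-html.xsl, and serialized as HTML, which corresponds to the {1} substitution in the *.html matcher.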
Place a DocBook 4.1.2 XML file in the $COCOON_HOME/mount/docbook/ directory.
A sample file is available from http://www.xml-dev.com:8080/cocoon/mount/docbook/openjade.xml.
If the file is named sample.xml, you can now access the document using a browser at http://localhost:8080/cocoon/mount/docbook/sample.html (HTML) or http://localhost:8080/cocoon/mount/docbook/sample.pdf (PDF).
This section has some pointers to related resources on the Internet.
If you would like to suggest additional resources for this section, please email me at <saqib@seagate.com>. Thanks.
Some of the news groups of interest are:
comp.text.sgml (easily accessible from Google Groups)
comp.text.xml (easily accessible from Google Groups)
htmldoc.general (server - news.easysw.com)
Here are some relevant mailing lists.
DocBook mailing list @ OASIS. Visit http://www.oasis-open.org/committees/docbook/mailinglist/index.shtml for more info.
DocBook mailing list @ TLDP. Visit http://www.tldp.org/mailinfo.html for more info.
xml-doc @ Yahoo Groups. Visit http://groups.yahoo.com/group/xml-doc/ for more info.
http://www.oasis-open.org/ OASIS maintains various DocBook DTDs
http://www.xml-dev.com/blog/ XML / XHTML WebLog
http://docbook.org/wiki/moin.cgi/ The DocBook Wiki
http://www.docbook.org/tdg/en/ Online version of DocBook: The Definitive Guide
http://www.bureau-cornavin.com/opensource/crash-course/index.html Writing Documentation Using DocBook: A Crash Course
http://www-106.ibm.com/developerworks/library/l-docbk.html A gentle guide to DocBook (very good introduction).
http://www.tldp.org/LDP/LDP-Author-Guide/index.html The Linux Documentation Project (TLDP) Author Guide
http://www.tldp.org/authors/index.html#resources DocBook resources provided by TLDP
http://www.tldp.org/HOWTO/DocBook-Demystification-HOWTO/ Eric Raymond's DocBook Demystification HOWTO
http://www.xml-dev.com:8080/cocoon/mount/docbook/ Tomcat + Cocoon + DocBook Setup Sample Site
|  | Note | 
|---|---|
| A comprehensive list of XML editors can be found at http://www.xml-dev.com/blog/#19 | 
eXchaNGeR - The XML Browser (and XML Editor) http://xngr.org/
XERLIN - XML Modeling Application http://www.xerlin.org/
DocPro by Command Prompt, INC. http://www.commandprompt.com/entry.lxp?lxpe=2
YAWC Pro by XML Workshop LTD. http://www.yawcpro.com/. Can be used for converting MS Word to Simple DocBook XML.
Logictran RTF Converter. http://www.logictran.com/. Word/RTF to HTML/XML.
MajiX - Word to XML converter. http://tetrasys.dhs.org/
XMETAL by SoftQuad http://www.softquad.com/
Tagless Editor by i4i (DocBook DTD not supported) http://www.i4i.com/
XML Editor by XMLmind http://www.xmlmind.com/xmleditor/
upCast and downCast by Infinity Loop http://www.infinity-loop.de/en/products.html
W2XML by DocSoft http://www.docsoft.com/w2xmlv2.htm
XMLwriter by Wattle Software http://xmlwriter.net/
oXygen XML Editor - Java Based http://www.oxygenxml.com/
Xeena by IBM http://www.alphaworks.ibm.com/tech/xeena
Excosoft XML Client http://www.excosoft.se/eweb/site/exc_pd.html
Timelux Xpress http://www.timelux.lu/html/Xpress2001.html
Morphon http://www.morphon.com/
Conglomerate http://conglomerate.org/