This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list.
Re: Converting from DocBook/SGML to DocBook/XML
- To: rsharpe at ns dot aus dot com
- Subject: Re: DOCBOOK-APPS: Converting from DocBook/SGML to DocBook/XML
- From: Dan York <dyork at e-smith dot com>
- Date: Thu, 12 Jul 2001 09:49:44 -0400
- Cc: docbook-apps at lists dot oasis-open dot org
- Organization: e-smith, Inc.
- References: <3B4D43BB.600@ns.aus.com>
Richard,
> What is involved in converting from DocBook/SGML to DocBook/XML? Just
> the header, or is there more serious work?
I've done a good bit of this and found it relatively trivial. In fact,
now that I have my user manual with the appropriate markup, I can leave it as
SGML to use the standard processing I've done, and just change the header
so it becomes XML and experiment with using the XML processing tools.
(And I have a Makefile that automates this - "make xml" creates a ".xml"
copy of my SGML file with the appropriate XML header.)
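The Makefile itself is not shown here, so the following is only a rough sketch of what a header-swapping step like "make xml" might do, written in Python rather than make. The function name swap_header is made up for illustration, and the XML identifiers are the DocBook XML V4.1.2 ones quoted in step 1 below. It assumes a DOCTYPE with a public identifier and no internal subset.

```python
import re

# XML prolog to write out. The identifiers are the DocBook XML V4.1.2 ones
# quoted in step 1 of this message; {root} is filled in from the SGML DOCTYPE.
XML_HEADER = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE {root} PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"\n'
    '  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">'
)

# Assumption: the SGML DOCTYPE has a public identifier and no internal subset.
SGML_DOCTYPE = re.compile(r'<!DOCTYPE\s+(\w+)\s+PUBLIC\s+"[^"]*"\s*>', re.IGNORECASE)


def swap_header(sgml_text):
    """Replace the first SGML DOCTYPE with an XML declaration and XML DOCTYPE."""
    def build_header(match):
        # XML names are case-sensitive, so the root element name is lowercased.
        return XML_HEADER.format(root=match.group(1).lower())

    return SGML_DOCTYPE.sub(build_header, sgml_text, count=1)
```

A wrapper script could read the ".sgml" file, pass its contents through swap_header, and write the result to a ".xml" file next to it.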
Here's what I have had to do for each file. (NOTE: I have *always*
typed all my DocBook tags in lowercase, and I have never used tag
minimization such as "</>" for end tags, so neither of those issues
affects me. If you did use uppercase tags or minimization, you'll have
to change those, too.)
1. Change the header from:
<!DOCTYPE BOOK PUBLIC "-//OASIS//DTD DocBook V4.1//EN">
to:
<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
2. Go through and verify that all attribute values are quoted. For
instance, I often did this:
<tgroup cols=2>
which needs to be changed to:
<tgroup cols="2">
3. Make sure that every empty element is explicitly closed. Two that I
found I used a lot were <colspec> and <imagedata>. So the SGML tag:
<imagedata fileref="foo.eps" format="EPS">
needs to become:
<imagedata fileref="foo.eps" format="EPS"/>
Note that some people prefer to actually put in an end tag:
<imagedata fileref="foo.eps" format="EPS"></imagedata>
I just preferred to use "<tag/>" because it was simpler. (And having
the / in there did not affect the SGML processing at all.)
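Steps 2 and 3 are the kind of thing search and replace handles well. As a minimal sketch only, the Python below quotes bare attribute values and closes the two empty elements mentioned above. The names to_xml_syntax and EMPTY_ELEMENTS are made up for illustration, and the regular expressions assume that no attribute value contains "<" or ">" and that no markup is hidden inside comments or CDATA sections.

```python
import re

# Elements to close as empty. Only the two mentioned in this message are
# listed; extend the tuple for other EMPTY elements in your documents.
EMPTY_ELEMENTS = ("colspec", "imagedata")

# A start tag: element name, optional attribute text, optional trailing slash.
TAG = re.compile(r"<([A-Za-z][\w.-]*)(\s[^<>]*?)?(/?)>")

# One attribute: name and "=", then a quoted value or a bare name token.
ATTR = re.compile(r"""(\s+[\w:.-]+\s*=\s*)("[^"]*"|'[^']*'|[\w.:-]+)""")

# Explicit end tags for the empty elements, e.g. </imagedata>.
EMPTY_END = re.compile(r"</(?:%s)\s*>" % "|".join(EMPTY_ELEMENTS), re.IGNORECASE)


def quote_value(match):
    """Wrap a bare attribute value in double quotes; leave quoted ones alone."""
    prefix, value = match.group(1), match.group(2)
    if value[0] in "\"'":
        return match.group(0)
    return prefix + '"' + value + '"'


def fix_tag(match):
    """Quote the attributes of one start tag and self-close empty elements."""
    name, attrs, slash = match.group(1), match.group(2) or "", match.group(3)
    attrs = ATTR.sub(quote_value, attrs)
    if name.lower() in EMPTY_ELEMENTS:
        slash = "/"
    return "<" + name + attrs + slash + ">"


def to_xml_syntax(text):
    """Apply steps 2 and 3 to a whole document held in a string."""
    # Drop explicit end tags first so <imagedata ...></imagedata> does not
    # end up as a self-closed tag followed by a stray end tag.
    text = EMPTY_END.sub("", text)
    return TAG.sub(fix_tag, text)
```

For example, to_xml_syntax('<tgroup cols=2>') gives '<tgroup cols="2">'. It is worth diffing the output against the original before trusting it on a large document.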
That was it. Like others have said, with some search and replace
it's a relatively quick process.
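One quick way to see whether the search and replace caught everything is to run the result through an XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the function name is_well_formed is made up for illustration. It only checks well-formedness and does not validate against the DocBook DTD. Because this parser does not read the external DTD, entity references that are defined only there may be reported as errors.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Unquoted attribute values and unclosed empty elements end up here.
        return False
    return True
```

An unquoted attribute such as cols=2, or an <imagedata> left without its closing slash, makes the function return False.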
Regards,
Dan
--
Dan York, Director of Training dyork@e-smith.com
Ph: +1-613-751-4401 Mobile: +1-613-263-4312 Fax: +1-613-564-7739
e-smith, inc. 150 Metcalfe St., Suite 1500, Ottawa,ON K2P 1P1 Canada
http://www.e-smith.com/ open source, open mind