This is the mail archive of the
docbook@lists.oasis-open.org
mailing list for the DocBook project.
Re: converting to docbook
- From: Bob Stayton <bobs at caldera dot com>
- To: jonathon <jblake at eskimo dot com>, docbook at lists dot oasis-open dot org
- Date: Tue, 13 Aug 2002 09:37:50 -0700
- Subject: Re: DOCBOOK: converting to docbook
- References: <Pine.GSU.4.44.0208130223420.4539-100000@eskimo.com>
On Tue, Aug 13, 2002 at 02:26:24AM -0700, jonathon wrote:
>
> All:
>
> I have roughly 10 000 documents of various formats
> [ plain ASCII, TeX, DocBook, HTML 4.01, XHTML 1.0,
> Word, WordPerfect, PDF and a couple of others. ]
>
> Can anybody point me to something that will easily convert
> these to docbook, and preserve some/most of their current
> formatting?
>
> I'm not looking forward to doing the conversion manually.
If I had that problem, I would convert as many of them
as I could to HTML, run 'tidy' to clean up the HTML,
and then run the DocParse tool from www.commandprompt.com to
convert them to DocBook. DocParse is not free, but it
is not expensive either.
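
With roughly 10 000 files, the 'tidy' cleanup step is probably worth
scripting. What follows is only a rough sketch of that one step, not a
tested pipeline: it assumes the 'tidy' program is installed and on the
PATH, and the function names and directory names are made up for
illustration. It writes all output into one flat directory, so files with
the same base name in different subdirectories would overwrite each
other. DocParse is left out because its interface is not described here.

```python
import subprocess
from pathlib import Path

# File extensions treated as HTML input (an assumption; adjust as needed).
HTML_SUFFIXES = {".html", ".htm", ".xhtml"}


def find_html(root):
    """Return every HTML-like file under root, sorted for stable ordering."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in HTML_SUFFIXES
    )


def tidy_command(src, out_dir):
    """Build (but do not run) the tidy command line for one input file."""
    dest = Path(out_dir) / (Path(src).stem + ".xhtml")
    # -asxhtml: emit well-formed XHTML; -quiet: suppress the summary text;
    # -o: write the cleaned result to dest instead of stdout.
    return ["tidy", "-quiet", "-asxhtml", "-o", str(dest), str(src)]


def clean_all(root, out_dir):
    """Run tidy over every HTML file under root; return the files that failed."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for src in find_html(root):
        result = subprocess.run(tidy_command(src, out_dir))
        # tidy exits with 0 when clean, 1 on warnings, 2 on errors.
        if result.returncode >= 2:
            failed.append(src)
    return failed
```

Calling clean_all("converted-html", "tidied") would then leave the
cleaned files in "tidied" and return the inputs tidy could not repair,
which are the ones most likely to need hand editing before the DocBook
conversion.
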
For your PDF documents, I'd look for the source document
that generated the PDF. It is tough (impossible?)
to convert PDF.
--
Bob Stayton 400 Encinal Street
Publications Architect Santa Cruz, CA 95060
Technical Publications voice: (831) 427-7796
Caldera International, Inc. fax: (831) 429-1887
email: bobs@caldera.com