This is the mail archive of the docbook@lists.oasis-open.org mailing list for the DocBook project.
There is a beta "SaveAsXML" plug-in for Adobe Acrobat 5 that has customizable mapping tables. If you produce a _tagged_ PDF (e.g. the output of InDesign or of MS Office PDFMaker, which is Windows only), SaveAsXML will convert it to a generic XML document, mapping block styles to XML tags. If you use DocBook tag names for your block styles, you get a primitive form of DocBook, or what I'm calling preDocBook. I customized the supplied XML mapping table to pick up Bold and convert it to <emphasis>, and Italic to <citetitle>. I've attached this table if anyone wants to try it out.

On Tue, Aug 13, 2002 at 02:26:24AM -0700, jonathon wrote:
> I have roughly 10 000 documents in various formats [plain ASCII, TeX, DocBook, HTML 4.01, XHTML 1.0, Word, WordPerfect, PDF and a couple of others]. Can anybody point me to something that will easily convert these to DocBook, and preserve some/most of their current formatting?

[snip]

For your PDF documents, I'd look for the source document that generated the PDF. It is tough (impossible?) to convert PDF.
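
The style-to-tag mapping described above can be illustrated with a small Python sketch. This is not the SaveAsXML mapping table format, which is not shown in this message; the names `BLOCK_TAGS`, `INLINE_TAGS`, and `to_predocbook`, the `<article>` wrapper, and the input structure are all hypothetical and chosen only to show the idea of turning named block styles and Bold/Italic runs into DocBook-like elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping tables (assumption: NOT the SaveAsXML table format).
BLOCK_TAGS = {"title": "title", "para": "para"}
INLINE_TAGS = {"bold": "emphasis", "italic": "citetitle"}


def to_predocbook(blocks):
    """Convert styled blocks to a rough DocBook-like XML string.

    blocks: list of (block_style, runs)
    runs:   list of (inline_style or None, text)
    """
    root = ET.Element("article")
    for block_style, runs in blocks:
        # Unknown block styles fall back to <para>.
        block = ET.SubElement(root, BLOCK_TAGS.get(block_style, "para"))
        last = None  # most recent inline child, if any
        for inline_style, text in runs:
            tag = INLINE_TAGS.get(inline_style)
            if tag is not None:
                # Styled run: wrap it in the mapped inline element.
                last = ET.SubElement(block, tag)
                last.text = text
            elif last is None:
                # Plain text before any inline element.
                block.text = (block.text or "") + text
            else:
                # Plain text following an inline element.
                last.tail = (last.tail or "") + text
    return ET.tostring(root, encoding="unicode")


doc = to_predocbook([
    ("title", [(None, "Converting PDF")]),
    ("para", [(None, "See "), ("italic", "Some Book Title"),
              (None, " for "), ("bold", "details"), (None, ".")]),
])
print(doc)
```

The result is only "preDocBook" in the sense used above: the element names are DocBook names, but nothing checks the output against the DocBook DTD, so further cleanup would still be needed.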
Attachment:
XML-1-00preDocBook.xml
Description: Text document