This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Is XML/XSL the right direction and if so where next.
- To: xsl-list at lists dot mulberrytech dot com
- Subject: [xsl] Is XML/XSL the right direction and if so where next.
- From: Matt Gushee <mgushee at havenrock dot com>
- Date: Wed, 14 Mar 2001 11:12:13 -0700 (MST)
- Cc: XSL-List at lists dot mulberrytech dot com
- References: <PM.28399.984589337@pmweb5.uk1.bibliotech.net>
- Reply-To: xsl-list at lists dot mulberrytech dot com
Sara Christie writes:
> Can I have an XML file which can be administered from MS Word (that
> being the tool of choice of the administrator) but which is also used
> with stylesheets (XSL) to display the content on the web. Any helpful
> suggestions on how to achieve this and alternative methods would be
> greatly appreciated. I am very keen to not re-invent the wheel nor to
> attempt the impossible.
Do you have the resources to create a custom Visual Basic application?
If so, recent versions of Word (since 97, I guess) provide a VB object
model that allows you to manipulate everything in a document. So you
could use this object model together with MSXML (be sure to get the
most recent version from msdn.microsoft.com/xml/) to:
1) Output a "dumb" XML representation of the object tree
(e.g., a VB Paragraph would become a <Paragraph></Paragraph>)
2) Use XSLT to transform the above to "smart" XML (i.e., XML that
describes the structure and semantics of your data).
Then of course you would need to be able to reverse the process to
output Word docs from XML.
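As a rough sketch of steps 1 and 2 plus the reverse pass, here is the mapping written in Python with the standard library's ElementTree, standing in for the XSLT transform described above. Everything specific is an assumption made for illustration: the <Document> wrapper, the style attribute on <Paragraph>, the style names, and the <report>/<title>/<para> vocabulary are hypothetical, not what a real Word export would necessarily produce.

```python
import xml.etree.ElementTree as ET

# Hypothetical "dumb" XML: one element per object in the Word object tree.
DUMB_XML = """\
<Document>
  <Paragraph style="Heading 1">Quarterly Report</Paragraph>
  <Paragraph style="Normal">Sales rose in the first quarter.</Paragraph>
  <Paragraph style="Normal">Costs were flat.</Paragraph>
</Document>
"""

# Assumed mapping between Word style names and semantic element names.
STYLE_TO_ELEMENT = {"Heading 1": "title", "Normal": "para"}
ELEMENT_TO_STYLE = {name: style for style, name in STYLE_TO_ELEMENT.items()}


def dumb_to_smart(dumb_xml: str) -> ET.Element:
    """Turn the object-tree dump into XML that names what each part means."""
    dumb_root = ET.fromstring(dumb_xml)
    smart_root = ET.Element("report")
    for paragraph in dumb_root.iter("Paragraph"):
        style = paragraph.get("style", "Normal")
        # Unknown styles fall back to a plain paragraph.
        name = STYLE_TO_ELEMENT.get(style, "para")
        child = ET.SubElement(smart_root, name)
        child.text = (paragraph.text or "").strip()
    return smart_root


def smart_to_dumb(smart_root: ET.Element) -> ET.Element:
    """Reverse pass: rebuild the paragraph list that Word would be fed."""
    dumb_root = ET.Element("Document")
    for child in smart_root:
        style = ELEMENT_TO_STYLE.get(child.tag, "Normal")
        paragraph = ET.SubElement(dumb_root, "Paragraph", {"style": style})
        paragraph.text = child.text
    return dumb_root


smart = dumb_to_smart(DUMB_XML)
print(ET.tostring(smart, encoding="unicode"))
print(ET.tostring(smart_to_dumb(smart), encoding="unicode"))
```

The round trip only works this cleanly because each style maps to exactly one element name. A real document would need a richer mapping, and that mapping is probably where most of the effort would go.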
I've tried this with a very simple PowerPoint presentation. It was
surprisingly easy. Word would be a bit more challenging, since Word
documents can include just about anything, and their high-level
structure is much less predictable than PowerPoint's.
There are a couple of problems with this approach:
* Is there a need to store formatting information?
As you may have realized, the Microsoft HTML format that you didn't
like is designed to do exactly that -- i.e., it is designed to
produce a document that can be opened on anyone's desktop anywhere
(or in Internet Explorer) and look exactly like it did to the person
who created it.
If your organization uses standardized templates and styles (or in
the unlikely event that no one cares how the output documents are
formatted), then you should be able to strip out all the formatting
and store only the content.
If formatting needs to be preserved on a per-document basis, then
you're probably better off using Microsoft's HTML/XML output, ugly
and convoluted though it is. You could run the resulting HTML files
through HTML Tidy to make them into XHTML, then run an XSLT
transform to create something more sensible from the XHTML.
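To give a feel for that last step, here is a minimal sketch in Python (ElementTree again, used in place of an XSLT stylesheet). It assumes the Word HTML has already been run through HTML Tidy and is well-formed XHTML. The sample markup, the MsoNormal class, and the <document>/<title>/<para> target names are illustrative assumptions. The sketch keeps each element's inline style so per-document formatting survives, and drops the Word-specific class names.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical Word HTML after HTML Tidy has turned it into XHTML.
TIDIED_XHTML = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Memo</title></head>
  <body>
    <h1 style="margin-top:12pt">Memo</h1>
    <p class="MsoNormal" style="font-family:Arial">Please review the <b>draft</b> by Friday.</p>
  </body>
</html>
"""

# Assumed mapping from XHTML element names to a simpler vocabulary.
XHTML_TO_SIMPLE = {"h1": "title", "p": "para"}


def simplify(xhtml: str) -> ET.Element:
    """Map tidied XHTML onto a simpler vocabulary, keeping inline styles."""
    root = ET.fromstring(xhtml)
    body = root.find(f"{{{XHTML_NS}}}body")
    doc = ET.Element("document")
    for element in body:
        # Strip the "{namespace}" prefix ElementTree puts on tag names.
        local_name = element.tag.split("}", 1)[-1]
        simple_name = XHTML_TO_SIMPLE.get(local_name)
        if simple_name is None:
            continue  # skip anything this sketch does not know about
        target = ET.SubElement(doc, simple_name)
        style = element.get("style")
        if style is not None:
            target.set("style", style)  # keep per-document formatting
        # itertext() flattens inline markup such as <b>; a fuller
        # transform would map inline elements as well.
        target.text = "".join(element.itertext()).strip()
    return doc


doc = simplify(TIDIED_XHTML)
print(ET.tostring(doc, encoding="unicode"))
```

Whether to carry the style strings along verbatim, as done here, or to translate them into named styles is a design choice that depends on how much of Word's formatting actually needs to survive.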
* How do you deal with graphics, especially embedded bitmap graphics?
* How do you enforce the workflow:
Word -> dumb XML -> smart XML -> dumb XML -> Word
... i.e., how do you ensure that all data is stored in smart XML,
and that the dumb XML and Word documents -- especially the latter
-- are transient, used only for user input and output?
Matt Gushee
Englewood, CO, USA
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list