This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list.



Re: Choosing a characterset for DocBook


On Fri, 15 Mar 2002, Jirka Kosek wrote:

> Jens Stavnstrup wrote:
> 
> > > If your documents will contain a lot of characters outside of ISO Latin 1
> > > or ASCII, using UTF-8 is the best choice, assuming that all editors used
> > > can deal with UTF-8.
> > >
> > 
> > Not really. The problem is basically that Word, which might be used to
> > edit the XML sources, kindly adds invisible characters to my document,
> > and this might cause problems for my colleagues.
> 
> Even if you save it as plain text? If you use ISO-8859-1, no additional
> characters should be added. The problem may be with UTF-8 usage: MS
> applications add a byte-order mark to the beginning of UTF-8 files. This
> is not supported by some older XML parsers, as it was not required by the
> XML spec. I think that the Second Edition of XML 1.0 solved this issue.


What is plain text?

I looked at Word 97. There you can save it as Unicode (but it doesn't
state what encoding is used).


In Word 2000, you can save as "Encoded Text" and actually specify which
encoding to use: UTF-8, ISO-8859-1, or quite a few others (very nice!).
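
On the byte-order mark mentioned above: if an older parser chokes on a
UTF-8 file saved this way, one workaround might be to strip the mark before
parsing. The following is only a rough Python sketch; the function name
strip_utf8_bom and the sample bytes are made up for illustration and are
not part of any DocBook tool.

```python
import codecs
from typing import Tuple


def strip_utf8_bom(data: bytes) -> Tuple[bytes, bool]:
    """Return the data without a leading UTF-8 BOM, plus a flag saying whether one was found."""
    if data.startswith(codecs.BOM_UTF8):  # BOM_UTF8 is the three bytes EF BB BF
        return data[len(codecs.BOM_UTF8):], True
    return data, False


# Hypothetical sample: an XML document as an editor might save it with a BOM.
sample = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="UTF-8"?><book/>'
cleaned, had_bom = strip_utf8_bom(sample)
```

After this, cleaned starts directly with the XML declaration and could be
written back to disk or handed to the parser.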

Regards


Jens





