This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list .



Re: Re-post: typographical characters & encoding


Ok, some progress.

The issue appears to be "special" apostrophes that were an artifact of the 
text having started out life in MS Word. XMetaL doesn't seem to do much 
about sorting these characters out or replacing them with "correct" 
apostrophes. Nor does it seem very good at finding and replacing these 
characters, or distinguishing among them.
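
For anyone stuck at the same point, here is a minimal Python sketch of the
kind of scan-and-replace I had to do by hand. It is only an illustration:
the function names and the replacement table are my own assumptions, not
anything provided by XMetaL or the DocBook stylesheets. It assumes the text
has already been decoded to Unicode, and it treats U+0091/U+0092 as Word
apostrophes that were saved as Windows-1252 and later read as iso-8859-1.

```python
import unicodedata

# Hypothetical replacement table; extend or change it to suit your text.
REPLACEMENTS = {
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
    "\u0091": "'",  # Windows-1252 byte 0x91 read as iso-8859-1
    "\u0092": "'",  # Windows-1252 byte 0x92 read as iso-8859-1
}


def find_suspects(text):
    """Return (line, column, code point, name) for each non-ASCII character."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                # Control characters have no Unicode name, hence the default.
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((lineno, col, f"U+{ord(ch):04X}", name))
    return hits


def normalize_apostrophes(text):
    """Replace the characters listed in REPLACEMENTS with a plain apostrophe."""
    return text.translate(str.maketrans(REPLACEMENTS))
```

Running find_suspects() first and reading the report before replacing
anything seems the safer order, since it also lists characters you want to
keep (bullets, em dashes, accented letters).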

Once the MS apostrophes are gone, PassiveTeX does an admirable job of 
rendering to PDF, as does FOP (provided you don't have much, if any, table 
markup in your source). Specifically, the bullets work properly and all 
other typographical characters appear to be correct.

Unfortunately, the IBM FOComposer software still doesn't deal with the 
bullet points, or em dashes for that matter. Perhaps next release.

Why is it that these curly apostrophes are not flagged in some way by the 
various validators and/or processors? XMetaL, Saxon, Xalan/Xerces? 
PassiveTeX was the only package that flagged these characters as anomalies.
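
A partial answer to my own question, as far as I can tell: U+2019 is a 
legal XML character, so a parser has no grounds to reject it, and any 
complaint can only come later from a formatter that lacks a glyph or 
mapping for it. The small Python check below illustrates the 
well-formedness half of that. The helper name is my own, and it uses 
Python's bundled parser, so it says nothing definite about how Saxon, 
Xalan or Xerces behave internally.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# A curly apostrophe (U+2019) and a plain ASCII apostrophe in otherwise
# identical content; both are ordinary character data to the parser.
curly = "<para>It\u2019s a test</para>"
plain = "<para>It's a test</para>"

print(is_well_formed(curly), is_well_formed(plain))
```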

                     ...edN

At 01:31 PM 24/10/2001 -0400, Ed Nixon wrote:

>>My PDF output contains incorrect characters for typographical symbols 
>>like the bullet, em-dash, and double quotes.
>
>Version: docbook-xsl-1.41
>
>I haven't made any progress with this problem on my own since yesterday 
>and I haven't seen any response from the list. Is there no one who can 
>take a minute to give me some direction?
>
>The XML file was encoded in UTF-8 and, as of today, in iso-8859-1. The FO 
>file has been encoded alternately in UTF-8 and UTF-16.
>
>Looking at the FO source file, the bullet actually displays. However...
>
>After generating the PDF from IBM PDFComposer, the bullet is replaced with 
>the number 42; out of PassiveTeX, vertical double quotes appear instead of 
>bullets.
>
>PassiveTeX is polite enough to also alert me to the fact that it doesn't 
>understand my apostrophes; there are no on-screen diagnostics from the IBM 
>processor.
>
>One thing that is confusing to me is the fact that some of the problem 
>characters are generated by the stylesheets, i.e., the bullets. I can see 
>that corrupt characters might end up in my source file given the Win2K 
>environment and XMetaL but I don't see how the stylesheet generated 
>characters could be problematical. And I have to confess, I don't 
>understand why the encoding parameter isn't enough to sort everything out 
>like magic.
>
>I'm happy to do the legwork if someone could simply point me to some 
>appropriate information.
>
>
>>Thanks for your help.                  ...edN
>
>
>
>
>----------------------------------------------------------------
>To subscribe or unsubscribe from this elist use the subscription
>manager: <http://lists.oasis-open.org/ob/adm.pl>



