This is the mail archive of the xsl-list@mulberrytech.com mailing list.

Post-Processing PDF For Back-Of-The-Book Indexes

From: "W. Eliot Kimber" <eliot at isogen dot com>
To: XSL List <xsl-list at lists dot mulberrytech dot com>
Date: Sun, 10 Feb 2002 09:20:32 -0600
Subject: [xsl] Post-Processing PDF For Back-Of-The-Book Indexes
Organization: DataChannel, Inc
Reply-to: xsl-list at lists dot mulberrytech dot com

In reference to an earlier thread about eliminating duplicate page
numbers in back-of-the-book indexes generated by XSL-FO styles, I have
successfully done this using the free PJ library from www.Etymon.com.
With this library you can interact with PDF at the lowest level of
granularity (individual PDF operators within a page). In my case, I was
able to get to the individual lines of the index pages, find sequences
of repeated page numbers, remove the repeats from the document, and
write a new PDF document. The initial functionality I needed took about
150 lines of Python (run under the Jython interpreter, which provides
access to the PJ Java library).
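
As a rough illustration of the "find sequences of repeated numbers"
step only, here is a minimal sketch in plain Python. It is not the code
described above and does not touch the PJ library or any PDF operators:
it assumes the text of one index line has already been extracted as a
string of the form "term, 12, 12, 13". The function name
collapse_repeated_pages and that line format are assumptions made for
the example.

```python
import re

def collapse_repeated_pages(line):
    """Drop consecutive repeated page numbers from one index line.

    Hypothetical helper: assumes the line looks like "term, 12, 12, 13",
    that is, entry text followed by a comma-separated list of page numbers.
    """
    # Split the line into the entry text and its trailing page list.
    match = re.match(r"^(.*?)((?:,\s*\d+)+)\s*$", line)
    if match is None:
        return line  # no trailing page list; leave the line untouched
    term, page_list = match.groups()
    pages = re.findall(r"\d+", page_list)

    # Keep a page number only if it differs from the one just before it.
    unique = []
    for page in pages:
        if not unique or unique[-1] != page:
            unique.append(page)
    return term + ", " + ", ".join(unique)

print(collapse_repeated_pages("indexing, 12, 12, 13, 13, 13, 20"))
```

Writing the cleaned line back into the PDF page is the part that needs
a PDF library and is not shown here.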

I'm not quite ready to post code (I need to refine what I've written
and do more testing), but I wanted to report this initial success, as I
know others are struggling with this same problem.

Cheers,

Eliot
-- 
W. Eliot Kimber, eliot@isogen.com
Consultant, ISOGEN International

1016 La Posada Dr., Suite 240
Austin, TX  78752 Phone: 512.656.4139

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list