This is the mail archive of the xsl-list@mulberrytech.com mailing list.



Re: AW: how can I compare two xml files ?


In message <55CB9FEAE87DD411A83200508BDF0DFE06D179@himail8.hi.bosch.de>,
Mengel Andre (FV/SLM) * <Andre.Mengel@de.bosch.com> writes

>> André,
>> 
>> I have tackled this problem in a slightly different way.  I iterate
>> through the two files, comparing them as I go.  The result is a third
>> XML document which lists the differences between the two files, in a
>> form which I hope can be used to reproduce the earlier version of a
>> document from its current form and this 'differences' document.
>> 
>> The style sheet is attached, in the hope that it might be a starting
>> point for your own work.

>If I have understood you, then two XML files are the same
>if the third file generated with your stylesheet is an
>empty file (because no differences could be found).

Not quite.  As it stands at present, the third file contains a
'skeleton' of the element structure of the current version of the
document.  You could easily alter the style sheet so that it didn't do
this, but I wanted to know where the changed elements etc. occurred in
the document, and this was an easy way of providing this information. 

>Would this file also be empty if the attributes in the two
>documents are in a different order?

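Since XML does not treat the order of attributes as significant, the
style sheet doesn't take it into account: attributes that differ only
in their order are not reported as a difference.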
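
As an illustration only (a minimal XSLT 1.0 sketch, not an extract from
the attached style sheet), an order-independent attribute comparison
could look something like this.  The template name 'same-attributes'
and its parameters 'a' and 'b' are invented for the example, and the
demo assumes that the two elements to compare are the first two
children of the document element.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: writes 'true' when elements $a and $b carry
       the same attributes with the same values, in any order. -->
  <xsl:template name="same-attributes">
    <xsl:param name="a"/>
    <xsl:param name="b"/>
    <!-- Collect one 'x' for every attribute of $a that has no
         counterpart (same name, same value) on $b. -->
    <xsl:variable name="mismatches">
      <xsl:for-each select="$a/@*">
        <xsl:variable name="ln" select="local-name()"/>
        <xsl:variable name="ns" select="namespace-uri()"/>
        <xsl:variable name="v" select="string(.)"/>
        <xsl:if test="not($b/@*[local-name() = $ln
                              and namespace-uri() = $ns
                              and . = $v])">x</xsl:if>
      </xsl:for-each>
    </xsl:variable>
    <!-- Equal counts plus no mismatches means the two sets are equal,
         because attribute names are unique within an element. -->
    <xsl:value-of select="count($a/@*) = count($b/@*)
                          and $mismatches = ''"/>
  </xsl:template>

  <!-- Demo (assumed input layout): compare the first two children of
       the document element. -->
  <xsl:template match="/">
    <xsl:call-template name="same-attributes">
      <xsl:with-param name="a" select="/*/*[1]"/>
      <xsl:with-param name="b" select="/*/*[2]"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

Run against <pair><item id="1" lang="en"/><item lang="en" id="1"/></pair>
this writes 'true'.  Nothing in the comparison depends on the order in
which the processor happens to return the attributes.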

>What happens if two documents are the same, but one of them
>contains ignorable whitespace?

The comparison works on whatever the XSLT processor is given, i.e. the
output from an XML parser.  If the parser has left the ignorable
white space alone (as it should), then it will be available for
comparison.  Note, however, that the style sheet has a 'normalize'
parameter (true by default) which applies the normalize-space() function
to all textual values before making a comparison, and also ignores
text() nodes which consist solely of white space.  So by default such
white space does not show up as a difference.
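
To make the effect of such a parameter concrete, here is a minimal
XSLT 1.0 sketch of the idea.  It is not the attached style sheet: the
string values 'true'/'false' for the parameter, the helper template
'comparable-value' and the demo output format are all assumptions made
for the example.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumed to be a string so that it can be overridden from the
       command line; any value other than 'true' disables normalization. -->
  <xsl:param name="normalize" select="'true'"/>

  <!-- Hypothetical helper: writes the value of $node in the form that
       would be used for a comparison. -->
  <xsl:template name="comparable-value">
    <xsl:param name="node"/>
    <xsl:choose>
      <xsl:when test="$normalize = 'true'">
        <xsl:value-of select="normalize-space($node)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$node"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Demo: list every text node that would take part in a comparison,
       one per line, in square brackets.  When normalizing, text nodes
       that consist solely of white space are skipped. -->
  <xsl:template match="/">
    <xsl:for-each select="//text()[not($normalize = 'true')
                                   or normalize-space(.) != '']">
      <xsl:text>[</xsl:text>
      <xsl:call-template name="comparable-value">
        <xsl:with-param name="node" select="."/>
      </xsl:call-template>
      <xsl:text>]&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With the default setting, two documents that differ only in indentation
or line breaks between elements produce the same list; with the
parameter set to anything other than 'true', every white space
difference that the parser passed through becomes visible.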

(Since this reply is going to the list, I've attached the style sheet so
the curious can see what I'm talking about!  All due credit to Oliver
Becker, whose 'merging' style sheet has been shamelessly used as the
basis for this work.)

Richard Light.

Richard Light
SGML/XML and Museum Information Consultancy
richard@light.demon.co.uk


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

