I've written an XML profiling tool

Eric S. Raymond esr@thyrsus.com
Fri Dec 20 19:23:00 GMT 2002


I've written a little wrapper script that uses Jirka Kosek's stylesheet
technique to do XML profiling.  I could release it as a project, but
it's so close to trivial that that seems kind of silly.

Enclosed is the script and a RefEntry for it.  Tim, do you want to take this
into your xmlto package?  Hack as you like, rename it, whatever; all I want
is for something equivalent to be stock in the next Red Hat release.

If you don't want it, tell me that.  Then I'll go ahead and release it as
a separate package.

(Why the hell is conditionalization called profiling, anyway?  Have I mentioned
recently that XML jargon makes me nauseous?)
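
To make the intended use concrete, here is a minimal sketch of a session.
The sample document, its os attribute values, and the assumption that the
enclosed script is installed on PATH as xmlprofile are all illustrative;
none of them come from the enclosed files.

```shell
#!/bin/sh
# Illustrative session only: the sample document and its os values are made up.
demo=$(mktemp -d /tmp/xmlprofile-demo.XXXXXX) || exit 1

cat >"$demo/demo.xml" <<'EOF'
<article>
  <para>Common to every platform.</para>
  <para os="linux">Linux only.</para>
  <para os="windows">Windows only.</para>
  <para os="linux|bsd">Linux or BSD.</para>
</article>
EOF

# Assumes the enclosed script has been installed on PATH as "xmlprofile".
# With "os linux", the os="windows" paragraph should be dropped and the
# other three kept.
if command -v xmlprofile >/dev/null 2>&1
then
    xmlprofile os linux "$demo/demo.xml" >"$demo/demo-linux.xml"
else
    echo "xmlprofile is not installed; skipping the run" 1>&2
fi
```

Elements without the os attribute pass through untouched, so only the
marked-up paragraphs are affected by the choice of value.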
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>
-------------- next part --------------
#!/bin/sh
#
# xmlprofile -- select portions of an XML document by attribute
#
# Trivial wrapper around Jirka Kosek's stylesheet-transform technique
# for profiling XML documents.  Uses xsltproc or saxon.
#
# Usage: xmlprofile param value file... 
#
# by Eric S. Raymond <esr@thyrsus.com> 18 September 2002

param=$1; shift
value=$1; shift
files="$*"

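# Create the temporary stylesheet with mktemp rather than a predictable name
stylesheet=$(mktemp /tmp/xmlprofile.XXXXXX) || exit 1
trap 'rm -f "$stylesheet"' 0
trap 'exit 1' 1 2 15

# Generate a stylesheet that knows about the attribute we're passing in
cat >"$stylesheet" <<EOF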
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<!-- Generate DocBook instance with correct DOCTYPE -->
<xsl:output method="xml" 
            doctype-public="-//OASIS//DTD DocBook XML V4.1.2//EN"
            doctype-system="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/>

<xsl:param name="${param}"/>

<!-- By default, copy all nodes in document -->
<xsl:template match="@*|*|text()|comment()|processing-instruction()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

<!-- Elements which have $param attribute set must be treated separately -->
<xsl:template match="*[@${param}]">
  <xsl:variable name="${param}.ok" select="not(@${param}) or
                contains(concat('|', @${param}, '|'), concat('|', \$${param}, '|')) or
                @${param} = ''"/>
  <xsl:if test="\$${param}.ok">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:if>
</xsl:template>

</xsl:stylesheet>
EOF

# Apply the generated stylesheet using whatever XSLT engine is handy.
# "$@" holds the remaining file arguments, so names with spaces survive.
if command -v xsltproc >/dev/null 2>&1
then
    for file in "$@"
    do
	xsltproc --novalid --stringparam "$param" "$value" "$stylesheet" "$file"
    done
elif command -v saxon >/dev/null 2>&1
then
    for file in "$@"
    do
	saxon "$file" "$stylesheet" "${param}=${value}"
    done
else
    echo "xmlprofile: couldn't find an XSLT engine!" 1>&2
    exit 1
fi

exit 0
# End
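
The escaping inside the here-document is the subtle part of the script, so
a minimal sketch of it in isolation may help.  The attribute name os is an
assumption chosen for illustration; the line being generated is copied from
the stylesheet template in the script.

```shell
#!/bin/sh
# Illustration only: "os" is an assumed attribute name.
param=os

# In an unquoted here-document, ${param} is expanded by the shell, while
# \$ is passed through as a literal dollar sign for XSLT to interpret.
line=$(cat <<EOF
<xsl:if test="\$${param}.ok">
EOF
)

echo "$line"    # prints: <xsl:if test="$os.ok">
```

The shell thus supplies the attribute name when the stylesheet is written,
and the XSLT engine resolves the resulting variable reference when the
stylesheet is run.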
-------------- next part --------------
<!DOCTYPE refentry PUBLIC 
   "-//OASIS//DTD DocBook XML V4.1.2//EN"
   "docbook/docbookx.dtd">
<refentry id='xmlprofile.1'>
<refmeta>
<refentrytitle>xmlprofile</refentrytitle>
<manvolnum>1</manvolnum>
<refmiscinfo class='date'>Sep 18 2002</refmiscinfo>
</refmeta>
<refnamediv id='name'>
<refname>xmlprofile</refname>
<refpurpose>conditionalization (profiling) for XML documents</refpurpose>
</refnamediv>
<refsynopsisdiv id='synopsis'>

<cmdsynopsis>
  <command>xmlprofile</command>
  <arg choice='plain'>attribute</arg> <arg choice='plain'>value</arg>
  <arg choice='plain' rep='repeat'>file</arg>
</cmdsynopsis>

</refsynopsisdiv>

<refsect1 id='description'><title>DESCRIPTION</title>

<para>This tool supports conditionally including or excluding sections
from XML documents, implementing a facility similar to SGML marked
sections.  To use it, specify an attribute name, a value, and a list
of files.  Each file will be processed in turn and the results sent to
standard output.</para>

<para>Every element that carries the attribute in its start tag is
copied to the output, together with its contents, if and only if the
attribute value matches the required value passed in on the
<command>xmlprofile</command> command line.  Elements that do not
carry the attribute are always copied.</para>

<para>A required value matches an attribute value if (a) they
are equal strings, (b) the attribute value contains or-bars and
the required value equals one of the or-bar-separated substrings, or
(c) the attribute value is empty.</para>
</refsect1>

<refsect1 id='authors'><title>AUTHORS</title>
<para>The generated-stylesheet technique for profiling was invented
by Jirka Kosek.  This implementation is by Eric
S. Raymond <email>esr@snark.thyrsus.com</email>.</para>
</refsect1>

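<refsect1 id='see'><title>SEE ALSO</title>
<para><citerefentry><refentrytitle>xsltproc</refentrytitle><manvolnum>1</manvolnum></citerefentry>.</para>
</refsect1>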

</refentry>
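
The matching rule in the DESCRIPTION can be sketched in plain shell.  This
illustrates the rule as documented and is not code from the script: the
function name profile_match is hypothetical.  It mirrors the stylesheet's
approach of wrapping both strings in or-bars before testing for
containment.

```shell
#!/bin/sh
# profile_match ATTR_VALUE REQUIRED_VALUE
# Hypothetical helper: succeeds when REQUIRED_VALUE equals ATTR_VALUE or
# one of its or-bar-separated alternatives.  An empty ATTR_VALUE always
# matches, as in the generated stylesheet.
profile_match() {
    [ -z "$1" ] && return 0
    case "|$1|" in
        *"|$2|"*) return 0 ;;
        *)        return 1 ;;
    esac
}

profile_match "linux"     "linux" && echo "exact: match"
profile_match "linux|bsd" "bsd"   && echo "alternative: match"
profile_match "linux|bsd" "lin"   || echo "substring: no match"
```

Wrapping both sides in or-bars is what keeps a partial string such as lin
from matching linux, since the test looks for a whole alternative with its
delimiters.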


