This is the mail archive of the docbook@lists.oasis-open.org mailing list for the DocBook project.



Elements and attributes (was Re: XML Schemas and docbook documents)


/ Karl Eichwalder <ke@gnu.franken.de> was heard to say:
| Norman Walsh <ndw@nwalsh.com> writes:
| > | <article
| > |   <class>whitepaper</>
| >
| > I've seen this suggested elsewhere recently. Specifically, that we
| > could introduce a new syntax using some prefix character for
| > "attribute elements".
| 
| Do a groups google search for <author/naggum/, <group/comp.lang.lisp/,
| <subject/XML and lisp/, <year/2001/ -- within this thread Erik Naggum
| argues once again the concept of elements and attributes causes trouble
| but doesn't solve problems.  I'm inclined to believe he's right.

I think the inherently unordered nature of attributes and the inherently
ordered nature of elements ("&" content models notwithstanding) makes them
distinct.
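
A minimal sketch of that distinction, assuming Python's standard
xml.etree.ElementTree module. The id attribute and the title/para children
are only illustrative markup; they are not taken from the thread.

```python
import xml.etree.ElementTree as ET

# Two documents that differ only in attribute order and in child order.
a = ET.fromstring('<article class="whitepaper" id="wp1"><title/><para/></article>')
b = ET.fromstring('<article id="wp1" class="whitepaper"><para/><title/></article>')

# Attribute order carries no information: the two attribute sets compare equal.
same_attributes = a.attrib == b.attrib

# Child order is part of the document: the two sequences differ.
order_a = [child.tag for child in a]
order_b = [child.tag for child in b]
same_children = order_a == order_b

print(same_attributes, same_children)
```

The parser exposes attributes as a mapping and children as a sequence, which
is the same split described above.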

Some things just seem to "belong" in attributes. Class values on article
are a good example. I think

  <article class="whitepaper">

is just better in some difficult-to-specify way than

  <article><class>whitepaper</class>

But maybe I've just been doing this too long. It can surely be argued
that there are no applications where all attributes couldn't be
replaced by elements (though the converse is not true).

On the subject of elements and attributes, James Clark gave a really
fine keynote at XML 2001. One of the topics he touched on briefly was
the problems with attributes and structure. From his keynote and in
the course of conversation during the week with James and others about
the entity references in attributes problem, I think I've formed one
really good rule of thumb about attributes.

"Thou shalt not put prose in attributes"

In other words, attributes should be reserved for lists of tokens, numbers,
simple datatypes, etc.

  <equation alt="Some prose description">

is always wrong: for internationalization, structure is always
required in text. Consider bidi for Arabic, Hebrew, and other writing
systems that introduce the need for mixed-direction typesetting and
additional marks like rubi required in (some?) Japanese.
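
One way to see why prose needs structure, sketched with Python's standard
xml.etree.ElementTree. The alt child element carrying inline markup is a
hypothetical illustration here, not a statement about DocBook's actual
content model.

```python
import xml.etree.ElementTree as ET

# Markup inside an attribute value is not well-formed XML: "<" is forbidden there.
try:
    ET.fromstring('<equation alt="the <emphasis>first</emphasis> term"/>')
    attribute_markup_ok = True
except ET.ParseError:
    attribute_markup_ok = False

# The same prose as element content can carry inline structure.
doc = ET.fromstring(
    '<equation><alt>the <emphasis>first</emphasis> term</alt></equation>'
)
alt = doc.find('alt')
inline_tags = [child.tag for child in alt]   # structure survives parsing
plain_text = ''.join(alt.itertext())         # and the text is still recoverable

print(attribute_markup_ok, inline_tags, plain_text)
```

An attribute value can only ever be a flat string, so anything like
directional overrides or annotation markup has nowhere to go.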

One nice consequence of this rule is that the need for entity
references (most frequently character entity references) in attribute
values is effectively removed. (Yes, I've used entity references in
attributes for other things, but not often and never without other
possible solutions.)

This clears the way for an element-based approach to character entities
for schema languages that don't provide for entity declarations. I'm
thinking of

  <e:eacute/> or <e:unicode name="latin_small_letter_e_with_acute"/>
  (and maybe even <e:isochar name="eacute"/>)

for &eacute;

There are lots of reasons why this is less perfect than &eacute; but I
think there are lots of reasons why it's better than &#x00E9; for many
applications.
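
A rough sketch of how a processor might expand such character elements,
again using Python's standard xml.etree.ElementTree. The namespace URI, the
CHARS lookup table, and the expand_char_elements helper are all assumptions
made for illustration; nothing in the proposal above fixes them.

```python
import xml.etree.ElementTree as ET

E_NS = "http://example.org/ns/char"   # hypothetical namespace bound to e:
CHARS = {"eacute": "\u00e9"}          # hypothetical name-to-character table


def expand_char_elements(parent):
    """Replace each e:NAME child with its character, merging text and tails."""
    prev = None  # last child kept so far; its tail receives following text
    for child in list(parent):
        if child.tag.startswith("{" + E_NS + "}"):
            name = child.tag.split("}", 1)[1]
            replacement = CHARS[name] + (child.tail or "")
            if prev is None:
                parent.text = (parent.text or "") + replacement
            else:
                prev.tail = (prev.tail or "") + replacement
            parent.remove(child)
        else:
            expand_char_elements(child)
            prev = child


doc = ET.fromstring('<para xmlns:e="%s">caf<e:eacute/> au lait</para>' % E_NS)
expand_char_elements(doc)
result = doc.text

print(result)
```

Because the character is an element, this only works in element content,
which is exactly what the no-prose-in-attributes rule makes sufficient.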

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com>      | Everything we love, no doubt, will
http://www.oasis-open.org/docbook/ | pass away, perhaps tomorrow,
Chair, DocBook Technical Committee | perhaps a thousand years hence.
                                   | Neither it nor our love for it is
                                   | any the less valuable for that
                                   | reason.--John Passmore

