Understanding XSL
by Norman Walsh
January 19, 1999
With that background, let's take a closer look at the style sheet in Example
2. XSL contains many more features than can be covered in an article of
this size. We'll consider just the features needed to write a simple style
sheet for the sample XML document in Example 1.
In order to display the sample document, we must handle five cases:
1. the document element,
2. the document title,
3. paragraphs,
4. emphasis (can be nested),
5. figures.
In this example, we'll use XSL to transform our XML document into HTML (see
Example 3). Each template in our style sheet
"instantiates" a small part of the result tree. XSL knits all of these fragments
together to form the complete result tree.
Example 3: The HTML that results from applying the XSL style sheet in Example 2 to the XML document in Example 1.
<HTML>
<HEAD>
<TITLE>A Document</TITLE>
</HEAD>
<BODY>
<H1>My Document</H1>
<P>This is a <I>short</I> document.</P>
<P>It only exists to <I>demonstrate a
<B>simple</B> XML document</I>.</P>
<DIV>
<B>Figure 1.</B><BR/>
<IMG src="myfig.gif"/><BR/>
<B>My Figure</B>
</DIV>
</BODY>
</HTML>
The Document Element. Since we know that the document element, doc,
always comes first, we'll use it to build the basic structure of our HTML
page. That's what the following rule does:
<xsl:template pattern="doc">
<HTML>
<HEAD>
<TITLE>A Document</TITLE>
</HEAD>
<BODY>
<xsl:process-children/>
</BODY>
</HTML>
</xsl:template>
Every element in the template is either an XSL instruction or
is copied literally into the result tree. In this rule, each element is copied
into the result tree until xsl:process-children is encountered.
When xsl:process-children is encountered, the XSL processor
processes each of the children of the current node. For each node, it finds
the matching template and instantiates it. The sequence of instantiated templates
is placed in the result tree at the location of the xsl:process-children
element in the template.
It's perfectly legitimate for a template to contain more than one occurrence
of xsl:process-children. However, the same processing is performed
each time.
The Document Title. For the document title, we simply want to output
an <H1>:
<xsl:template pattern="doc/title">
<H1>
<xsl:process-children/>
</H1>
</xsl:template>
Note that we've used the pattern "doc/title", which distinguishes
document titles from figure titles.
Example 2 can be extended with the following
templates. You can also view a style sheet that incorporates all the templates.
Paragraphs. Formatting paragraphs is easy:
<xsl:template pattern="para">
<P>
<xsl:process-children/>
</P>
</xsl:template>
Emphasis. Designating emphasis is a little more interesting because
it can be nested. The following template handles the simple, unnested case:
<xsl:template pattern="em">
<I>
<xsl:process-children/>
</I>
</xsl:template>
If this were the only template for em, nested emphasis would produce nested
<I> tags in the output. We could rely on the browser to handle
this case, but let's not. The following rule applies boldface to an em that
is nested directly inside another em:
<xsl:template pattern="em/em">
<B>
<xsl:process-children/>
</B>
</xsl:template>
If necessary, additional rules could be added for triply nested emphasis
and beyond.
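As a sketch of what such an additional rule might look like, the template below handles the triply nested case using only the constructs already shown. The choice of underlining for the third level is an assumption made purely for illustration, and the sketch presumes that the more specific pattern takes precedence, just as em/em does over em.

```xml
<!-- Hypothetical rule for triply nested emphasis: an em inside an em inside an em. -->
<!-- Assumption: the third level is underlined; any distinct markup would do. -->
<xsl:template pattern="em/em/em">
  <U>
    <xsl:process-children/>
  </U>
</xsl:template>
```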
Figures. Presentation of figures involves a bit more processing. The
goal is to enumerate the figures in a document and present the figure title
as a caption below the graphic (although it appears before the graphic in
the source document).
Here's the template for figure:
<xsl:template pattern="figure">
<DIV>
<B>Figure <xsl:number level="any" count="figure"/>.</B><BR/>
<xsl:process select="graphic"/>
<xsl:process select="title"/>
</DIV>
</xsl:template>
The figure template begins by constructing a DIV.
Every template must construct a single fragment of the result tree, so there
must be a top level wrapper for everything in the figure template. In HTML,
DIV and SPAN are reasonable wrappers; in XSL, sequence
serves this role.
Next we output the word "Figure" and use xsl:number to output
the figure number. The xsl:number instruction counts
elements in the source tree. With xsl:number you can select single
or multilevel numbering, which nodes to count, where to start counting, and
the format of the resulting number. In this case, we're counting figure
nodes anywhere in the document (preceding the current node). If our document
were divided into sections or chapters, we might wish to count figures only
within the current section. The result will be an Arabic number (1, 2, and
so on) since we did not specify a format.
The xsl:process instruction processes only selected children
(or selected nodes from elsewhere in the tree). The xsl:process
element has a required select attribute. All of the elements
in the source tree that match the pattern specified in the select
attribute are processed, and their instantiated templates are inserted into
the result tree at the location of the xsl:process element. By
default, the select pattern is "anchored" at the current node, but there are
facilities for relative and absolute positioning to move the anchor elsewhere
in the tree.
First the graphic element is processed, then the title.
Technically, these two instructions process all graphics and all titles
within the figure. If multiple graphics or titles were provided,
a more complex select pattern would be required to process only the first.
(See the "Suggested Exercises" section.)
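Because output order follows the order of the xsl:process elements in the template rather than the order of the elements in the source, rearranging the result only requires rearranging those elements. As a sketch, here is an alternative version of the figure template (a replacement for the one above, not an addition to it) that would place the caption above the graphic. It uses only the constructs already introduced; the extra BR after the title is an assumption about the desired layout.

```xml
<!-- Alternative figure template (sketch): caption first, then the graphic. -->
<xsl:template pattern="figure">
  <DIV>
    <B>Figure <xsl:number level="any" count="figure"/>.</B><BR/>
    <!-- The title is now processed before the graphic. -->
    <xsl:process select="title"/>
    <!-- Assumed line break so the image starts on its own line. -->
    <BR/>
    <xsl:process select="graphic"/>
  </DIV>
</xsl:template>
```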
Formatting Graphics. The graphic element must be transformed into
an IMG tag. Note that the IMG tag is empty and must
therefore use XML empty-element syntax:
<xsl:template pattern="graphic">
<SPAN>
<IMG src="{attribute(fileref)}"/>
<BR/>
</SPAN>
</xsl:template>
The interesting point here is the use of curly braces in the src attribute.
XSL provides the xsl:value-of instruction for computing generated
text. Since elements cannot occur in attributes, curly braces in an attribute
value are treated as calls to xsl:value-of.
The xsl:value-of instruction takes an expression (implicitly
the content of the curly braces), and returns the content of the element or
attribute located by that expression. So the template above places the value
of the fileref attribute on graphic into the src
attribute on IMG.
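The same curly-brace mechanism can fill in more than one attribute. The sketch below assumes a hypothetical alt attribute on graphic, which the sample document does not define, and copies it into the alt attribute on IMG alongside fileref.

```xml
<!-- Sketch only: "alt" on graphic is a hypothetical attribute, -->
<!-- not part of the sample document in Example 1. -->
<xsl:template pattern="graphic">
  <SPAN>
    <!-- Each pair of curly braces is evaluated separately. -->
    <IMG src="{attribute(fileref)}" alt="{attribute(alt)}"/>
    <BR/>
  </SPAN>
</xsl:template>
```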
Formatting Titles. Finally, the title of the figure must be formatted.
Like the document title template, the pattern on this template must be qualified:
<xsl:template pattern="figure/title">
<B>
<xsl:process-children/>
</B>
</xsl:template>
Suggested Exercises
If you're inspired by the examples you've seen so far, here are a few exercises
to consider. Some of them require features that are not covered here
but are described in the first Working Draft.
- Rewrite the select patterns in the figure template to
process only the first graphic or title.
- Correctly handle the HTML TITLE element in the HEAD
so that it contains the proper document title rather than a fixed, literal
string.
- Write the style sheet using XSL formatting objects. Using formatting
objects will allow your document to be rendered equally well in a variety
of media, rather than simply with a Web browser.
Conclusion
The first XSL Working Draft substantially defines the XSL language. Although
there is still a long way to go, one only has to look at the original XSL
submission (www.w3.org/TR/NOTE-XSL-970910)
to see how far we've come.
In this article, I've tried to present some of the motivations for XSL, to
demonstrate in a small way its expressive power, and to whet your appetite
to review the Working Draft.
The XSL Working Group will continue to make changes to XSL, some of which
will not be backwards compatible, but it seems likely that the general direction
of XSL can be well understood from the first Working Draft. There are many
important and complex issues that must still be resolved, among them: interactivity,
support (if any) for a more powerful scripting language, further harmonization
of the formatting object semantics, and the definition of many additional
formatting objects.
Copyright
© Web Techniques. All rights reserved.