XSLT Reflection
by Jirka Kosek
November 05, 2003
Many modern programming languages contain a special interface called
reflection. Reflection can be used to programmatically read, modify,
and create code in a particular language. Because the main purpose of
XSLT is to transform XML documents, and because an XSLT stylesheet is
expressed in XML syntax, we can use XSLT to manipulate stylesheets
themselves. In this article I'm going to show you how useful XSLT
reflection can be.
Reading XSLT Code
The most fundamental reflection task is to read code. This is very easy
in XSLT. You can query an XSLT stylesheet like any other XML document; the
only thing you must not forget is to specify the correct XSLT namespace
for all queried elements. The following examples assume that the prefix
xsl is bound to the XSLT namespace
(http://www.w3.org/1999/XSL/Transform).
We can query the source document, which is in fact another XSLT
stylesheet, like any other XML document. For example, to get the total
number of templates in the stylesheet, we can use:
Number of templates: <xsl:value-of select="count(//xsl:template)"/>
We can also create templates that match elements in the source XSLT
stylesheet. For example, to get statistics about keys defined in a
particular stylesheet, we can use the following template:
<xsl:template match="xsl:key">
Key name: <xsl:value-of select="@name"/>
Matches: <xsl:value-of select="@match"/>
Use expr: <xsl:value-of select="@use"/>
</xsl:template>
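The two fragments above can be combined into one small reporting stylesheet. The following is only a sketch: the text output method and the exact report layout are assumptions made for illustration, not part of the original samples.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Plain-text report; the input document is itself an XSLT stylesheet -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:text>Number of templates: </xsl:text>
    <xsl:value-of select="count(//xsl:template)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Visit only the key definitions; everything else is ignored -->
    <xsl:apply-templates select="//xsl:key"/>
  </xsl:template>

  <xsl:template match="xsl:key">
    <xsl:text>Key name: </xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>, matches: </xsl:text>
    <xsl:value-of select="@match"/>
    <xsl:text>, use expr: </xsl:text>
    <xsl:value-of select="@use"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run it with any XSLT 1.0 processor, passing the stylesheet you want to inspect as the source document.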
We can even access the contents of the currently processed stylesheet
by placing a call to the document('') function at the beginning
of an XPath expression.
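As a minimal sketch of this self-inspection, the stylesheet below counts its own templates regardless of the source document it is applied to. It assumes the processor can re-read the stylesheet from its own URI, which is the usual case when the stylesheet is loaded from a file.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- document('') returns the root node of this stylesheet document,
         not of the source document being transformed -->
    <xsl:text>Templates in this stylesheet: </xsl:text>
    <xsl:value-of select="count(document('')//xsl:template)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Note that document('') sees only the stylesheet module containing the expression; templates from imported or included modules are not counted.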
Reading XSLT code from an XSLT stylesheet is not tricky. However, that's
not true for generating XSLT code. We cannot write the XSLT elements to be
generated directly in templates, as this would confuse the XSLT processor:
it cannot tell which elements are instructions that control the
transformation flow and which elements should just be copied to the
output.
One way of overcoming this issue is to use the xsl:element
instruction to emit all elements in the generated XSLT stylesheet. This
approach is a little inconvenient; in order to create a simple
stylesheet like
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/">
Hello! This text was created by an automatically generated stylesheet.
</xsl:template>
</xsl:stylesheet>
you have to use a rather verbose stylesheet, like
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/">
<xsl:element name="xsl:stylesheet">
<xsl:attribute name="version">1.0</xsl:attribute>
<xsl:element name="xsl:template">
<xsl:attribute name="match">/</xsl:attribute>
Hello! This text was created by an automatically generated stylesheet.
</xsl:element>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
The second approach is easier, once you know how to use the
xsl:namespace-alias instruction. This instruction allows us
to remap namespaces after a document is transformed, and thus we can
write the generated XSLT elements directly, using a temporary namespace.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xslo="http://www.w3.org/1999/XSL/TransformAlias"
version="1.0">
<xsl:namespace-alias stylesheet-prefix="xslo" result-prefix="xsl"/>
<xsl:template match="/">
<xslo:stylesheet version="1.0">
<xslo:template match="/">
Hello! This text was created by an automatically generated stylesheet.
</xslo:template>
</xslo:stylesheet>
</xsl:template>
</xsl:stylesheet>
You can read more about this namespace aliasing technique in the
article Namespaces and XSLT Stylesheets by Bob DuCharme, or you can read
the corresponding section in the XSLT recommendation.
Now that we have learned how to read, query, and write XSLT stylesheets
using XSLT, we can utilize our knowledge to do something really
useful.
Convert HTML Stylesheets to XHTML Stylesheets
An obvious use of XSLT reflection is to refactor existing
stylesheets. Suppose we have a large base of stylesheets that should be
changed in a way that can be captured algorithmically. For example, we may
want to modify existing HTML stylesheets to produce XHTML. With only one
stylesheet, such rewriting can be done by hand. But if there is
more than one stylesheet, the XHTML stylesheet should be automatically
derived from the HTML one. Such an automatic derivation can be expressed
in the form of an XSLT transformation.
Let's summarize the main differences between HTML and XHTML:
- all XHTML elements belong to the namespace http://www.w3.org/1999/xhtml
- XHTML is an XML language, not an SGML one
- XHTML uses different public and system identifiers in the DOCTYPE declaration
Now we must express these changes as changes in the XSLT
stylesheet. The first change means that for each non-namespaced element in
the original HTML stylesheet, we must add the correct namespace. All other
elements should be copied intact. The following template can accomplish
this change for us:
<xsl:template match="*">
<xsl:choose>
<!-- When the element is not in a namespace, it is an HTML element,
which should be transformed into an XHTML element in the proper namespace -->
<xsl:when test="namespace-uri(.) = ''">
<xsl:element name="{local-name(.)}"
namespace="http://www.w3.org/1999/xhtml">
<!-- Copy through attributes -->
<xsl:copy-of select="@*"/>
<!-- Process content of the element -->
<xsl:apply-templates/>
</xsl:element>
</xsl:when>
<!-- Other elements (mostly XSLT instructions) are copied through -->
<xsl:otherwise>
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
However, this doesn't correct the namespace for elements produced by the
xsl:element instruction. Therefore, another template is
needed.
<xsl:template match="xsl:element">
<!-- Copy xsl:element instruction -->
<xsl:copy>
<!-- Copy original attributes -->
<xsl:copy-of select="@*"/>
<!-- Add element to the right namespace -->
<xsl:attribute name="namespace">http://www.w3.org/1999/xhtml</xsl:attribute>
<!-- Process content of the instruction -->
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
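Note that the template above replaces any namespace attribute already present on an xsl:element instruction, because an attribute added later overrides a copied attribute of the same name. If your stylesheets also generate non-HTML elements this way, a more conservative variant may be preferable. The following is a sketch only; the extra conditions in the match pattern are an assumption about what you would want to leave untouched, not part of the original converter.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Default: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Only touch xsl:element instructions that have no namespace attribute
       and whose name attribute contains no colon (no prefix) -->
  <xsl:template
      match="xsl:element[not(@namespace) and not(contains(@name, ':'))]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="namespace">http://www.w3.org/1999/xhtml</xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The predicate gives the second template a higher default priority than the generic copy template, so no explicit priority attribute is needed.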
To complete the picture, we must also copy the other content that elements
may have: text, comments, and processing instructions.
<xsl:template match="comment()|processing-instruction()|text()">
<xsl:copy/>
</xsl:template>
The second thing to do is to change the output method from HTML to XML, and
also to output the correct public and system identifiers for XHTML. This
behavior is controlled by the xsl:output instruction. The
following template does this job.
<xsl:template match="xsl:output">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:attribute name="method">xml</xsl:attribute>
<xsl:attribute name="encoding">UTF-8</xsl:attribute>
<!-- Keep each identifier on one line so no stray whitespace ends up in the attribute value -->
<xsl:attribute name="doctype-public">-//W3C//DTD XHTML 1.0 Transitional//EN</xsl:attribute>
<xsl:attribute name="doctype-system">http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd</xsl:attribute>
</xsl:copy>
</xsl:template>
To handle the cases where xsl:output is missing, we should
test for its absence and create a new xsl:output instruction in the output
XHTML stylesheet.
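One way this test might look is sketched below. It is an illustration under stated assumptions rather than the code from the sample files: it handles only xsl:stylesheet (not the xsl:transform synonym), and in the complete converter its logic would have to be merged with any other template that matches xsl:stylesheet.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Default: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Stylesheet without any xsl:output: add one -->
  <xsl:template match="xsl:stylesheet[not(xsl:output)]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- xsl:import must stay first among the top-level elements -->
      <xsl:apply-templates select="xsl:import"/>
      <xsl:element name="xsl:output">
        <xsl:attribute name="method">xml</xsl:attribute>
        <xsl:attribute name="encoding">UTF-8</xsl:attribute>
        <xsl:attribute name="doctype-public">-//W3C//DTD XHTML 1.0 Transitional//EN</xsl:attribute>
        <xsl:attribute name="doctype-system">http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd</xsl:attribute>
      </xsl:element>
      <!-- Everything else follows the new xsl:output -->
      <xsl:apply-templates select="node()[not(self::xsl:import)]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The new instruction is created with xsl:element, as in the verbose approach shown earlier, so the processor does not mistake it for an instruction of the converter itself.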
It seems that the stylesheet is now ready to convert HTML stylesheets
into XHTML ones. But if we try it, we find that generated stylesheets
contain a lot of default namespace declarations in the following form:
<someelement xmlns="http://www.w3.org/1999/xhtml">
Even worse, this declaration is usually repeated in the output generated by
these stylesheets. This is not an error, but it makes the page needlessly
long and can cause problems for some older browsers.
We can get rid of these superfluous declarations by declaring the XHTML
namespace as the default namespace on the root element of the
stylesheet. This sounds easy, but XSLT doesn't offer any standard way of
creating such declarations. In the most widely used processors, like Saxon
and xsltproc, the following trick works. We can create an element in the
XHTML namespace and store it in a variable. From this variable we can copy
just the namespace axis to the root element, and we will get the
corresponding default namespace declaration there.
<xsl:template match="xsl:stylesheet">
  <!-- Store a temporary element from the XHTML namespace in the variable -->
  <xsl:variable name="temp">
    <xsl:element name="dummy" namespace="http://www.w3.org/1999/xhtml"/>
  </xsl:variable>
  <!-- Copy xsl:stylesheet element -->
  <xsl:copy>
    <!-- Copy just the namespace declarations from the dummy element;
         the exsl prefix must be bound to http://exslt.org/common -->
    <xsl:copy-of select="exsl:node-set($temp)//namespace::*"/>
    <!-- Copy original xsl:stylesheet attributes -->
    <xsl:copy-of select="@*"/>
    <!-- Process the content of the original stylesheet -->
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>
The complete stylesheet is part of the sample
files. You can use it to convert almost any HTML stylesheet to a
stylesheet that produces XHTML. The same method is used in the DocBook XSL
stylesheets to produce the XHTML version of the stylesheets from the HTML
version, which is the version that people actually develop by hand.
Localization Without Performance Loss
XSLT is often used to create a website from XML sources. Many
organizations today need multilingual websites. The common approach to
creating such sites with XSLT is to store locale-dependent messages in a
special XML file known as the message catalog. Every time we need to
display language-dependent text, we call a special template with
parameters identifying the current language and the requested text. So
instead of simply typing
<h4>Welcome!</h4>
in the non-localized stylesheet, we call a template that returns the correct
welcome text in the desired language.
<h4>
<xsl:call-template name="gentext">
<xsl:with-param name="text">Welcome</xsl:with-param>
<xsl:with-param name="lang" select="$currentLang"/>
</xsl:call-template>
</h4>
These gentext templates usually use the
document() function to look up the desired text in an external
message catalog.
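For illustration, a gentext template of this kind might look like the sketch below. The catalog file naming (en.xml, cs.xml, and so on) and the l/text[@key] structure are assumptions modeled on the catalog format used later in this article; real-world gentext templates differ in detail.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="html"/>

  <!-- Language selected for this run; can be overridden by the caller -->
  <xsl:param name="currentLang" select="'en'"/>

  <!-- Runtime lookup of one message in the catalog for the given language -->
  <xsl:template name="gentext">
    <xsl:param name="text"/>
    <xsl:param name="lang" select="'en'"/>
    <!-- The catalog URI is resolved relative to this stylesheet -->
    <xsl:value-of
        select="document(concat($lang, '.xml'))/l/text[@key = $text]"/>
  </xsl:template>

  <xsl:template match="/">
    <h4>
      <xsl:call-template name="gentext">
        <xsl:with-param name="text">Welcome</xsl:with-param>
        <xsl:with-param name="lang" select="$currentLang"/>
      </xsl:call-template>
    </h4>
  </xsl:template>
</xsl:stylesheet>
```

Every localized string costs one such call and one catalog lookup while the page is being generated.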
This solution has two big drawbacks. First, typing the long code needed to
call a template is very inconvenient, especially in comparison with writing
a non-localized stylesheet. The second drawback is poor performance: the
stylesheet must repeatedly look up messages in the message catalog during
each server request.
Both of these drawbacks can be easily overcome using a simple
solution. We will not create a real stylesheet, but just a stylesheet
template that can be later transformed into specialized stylesheets for
each language. The stylesheet template will be a real XSLT stylesheet,
which will use elements from a special namespace instead of the text
constants. For example the heading with the welcome text will be written
as
<h4><msg:Welcome/></h4>
Such a template for an XSLT stylesheet can be merged with the message
catalog for each language. Elements from the msg namespace
are replaced by localized text, and we get an XSLT stylesheet
for each supported language. Such stylesheets contain all localized text
directly, so there is no lookup cost at runtime. Writing template
stylesheets is also much easier than the common solution using localization
templates invoked at runtime.
Our solution is better overall; its only cost is somewhat more
management. We must process the stylesheet template into real localized
stylesheets every time a change is made to the template or to a message
catalog. This transformation can of course be expressed as an XSLT
transformation, and the whole process should be automated with a Makefile,
an Ant task, or a batch file.
Figure 1. From the stylesheet template to localized stylesheets
Let's explore the proposed solution in more depth. Message catalogs are
simple XML documents. For each language there is one such file, named after
the ISO language code (e.g. en.xml for English, cs.xml
for Czech). A sample catalog looks like this:
<?xml version="1.0" encoding="utf-8"?>
<l lang="en">
<text key="Invoice">Invoice</text>
<text key="Welcome">Hello and Welcome!</text>
<text key="Description">Description</text>
<text key="Quantity">Quantity</text>
<text key="UnitPrice">Unit price</text>
<text key="Subtotal">Subtotal</text>
<text key="Total">Total</text>
</l>
Note that key names correspond to local names of the msg:*
elements in the template stylesheet.
Now we need a transformation to replace all occurrences of the
msg:* elements with the corresponding texts from the
catalog. This can be easily expressed by the following stylesheet. It
copies all stylesheet parts to output unmodified except the
msg:* elements.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msg="urn:x-kosek:schemas:messages:1.0"
exclude-result-prefixes="msg"
version="1.0">
<xsl:output method="xml"/>
<xsl:param name="lang">en</xsl:param>
<xsl:param name="messages" select="document(concat($lang, '.xml'))/l"/>
<!-- Copy stylesheet untouched -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Replace msg:* elements with corresponding entry from message catalog -->
<xsl:template match="msg:*" priority="1">
<xsl:value-of select="$messages/text[@key = local-name(current())]"/>
</xsl:template>
</xsl:stylesheet>
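To make the input side concrete, here is a hypothetical template stylesheet that the transformation above could process. The page layout is invented for illustration; only the msg namespace URI and the key names come from the listings in this article.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:msg="urn:x-kosek:schemas:messages:1.0"
                exclude-result-prefixes="msg"
                version="1.0">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <!-- Each msg:* element is a placeholder for a catalog entry -->
        <h4><msg:Welcome/></h4>
        <table>
          <tr>
            <th><msg:Description/></th>
            <th><msg:Quantity/></th>
            <th><msg:UnitPrice/></th>
          </tr>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With the English catalog shown earlier, `<h4><msg:Welcome/></h4>` becomes `<h4>Hello and Welcome!</h4>` in the generated stylesheet. Because xsl:copy also copies namespace nodes, the generated stylesheet still declares the msg namespace; keeping exclude-result-prefixes="msg" in the template stylesheet stops that declaration from leaking into the final pages.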
You can download the sample stylesheet template
with message catalogs and other files.
Conclusion
This article showed two real-world advantages of XSLT being expressed
in XML syntax. This allows stylesheet authors to manipulate
XSLT code directly from their stylesheets and to use it for various
interesting effects. This capability of XSLT is very similar to the
concept of reflection known from other programming languages.