
Declaring Keys and Performing Lookups
by Bob DuCharme
February 06, 2002
When you need to look up values based on some other value --
especially when your stylesheet needs to do it a lot -- XSLT's
xsl:key instruction and key() function work together
to make it easy. They can also make it fast. To really appreciate the
use of keys in XSLT, however, let's first look at one way to solve
this problem without them. Let's say we want to add information about
the shirt elements in the following document to the result
tree, with the color names instead of the color codes in the
result.
<shirts>
  <colors>
    <color cid="c1">yellow</color>
    <color cid="c2">black</color>
    <color cid="c3">red</color>
    <color cid="c4">blue</color>
    <color cid="c5">purple</color>
    <color cid="c6">white</color>
    <color cid="c7">orange</color>
    <color cid="c7">green</color>
  </colors>
  <shirt colorCode="c4">oxford button-down</shirt>
  <shirt colorCode="c1">poly blend, straight collar</shirt>
  <shirt colorCode="c6">monogrammed, tab collar</shirt>
</shirts>
We want the output to look like
blue oxford button-down
yellow poly blend, straight collar
white monogrammed, tab collar
The following stylesheet has an xsl:value-of instruction
that uses an XPath expression to retrieve the contents of the
colors element's appropriate color child. It does
this by finding, for each shirt element, the color
element whose cid attribute value matches the shirt
element's colorCode attribute value. (For example, it takes the
colorCode value of "c4" for the first shirt element and
searches through the colors element's color children
to find one with a cid attribute that has that same value:
the one with "blue" as its contents.) Above that
xsl:value-of element, an xsl:variable instruction
sets the shirtColorCode variable equal to the shirt
element's colorCode attribute value, and the XPath expression has
a predicate of [@cid = $shirtColorCode] to get only the
color element whose cid attribute has the same value
as the shirtColorCode variable.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="text"/>
<xsl:template match="shirt">
  <xsl:variable name="shirtColorCode" select="@colorCode"/>
  <xsl:value-of
   select="/shirts/colors/color[@cid = $shirtColorCode]"/>
  <xsl:text> </xsl:text><xsl:apply-templates/><xsl:text>
</xsl:text>
</xsl:template>
<xsl:template match="color"/>
</xsl:stylesheet>
This produces the desired output, but the predicate in the XPath
expression may force the processor to scan the color elements again
for every shirt element, so if you have a lot of shirt elements
whose colors need to be looked up, creating the result tree could go
slowly. Declaring and using keys can make it go much faster, because
an XSLT processor that sees that you've declared a key usually sets up
an index in memory to speed these lookups.
The next stylesheet does the same thing as the previous one by
using the xsl:key instruction to declare the nodes and values
used for the color name lookups and the key() function to
actually perform the lookups.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="text"/>
<xsl:key name="colorNumKey" match="color" use="@cid"/>
<xsl:template match="colors"/>
<xsl:template match="shirt">
  <xsl:value-of select="key('colorNumKey',@colorCode)"/>
  <xsl:text> </xsl:text><xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>

The xsl:key element has three attributes:
- The name attribute holds the name of the lookup key. The
  key() function uses this name to identify what kind of lookup
  it's doing.
- The match attribute holds a match pattern identifying the
  collection of nodes where the lookups will take place. In the
  example, the color elements are this collection. The fact
  that they are enclosed by a colors element gives the source
  document a little more structure, but it's not necessary for the key
  lookups to work.
- The use attribute specifies the part or parts of the
  match attribute's collection of nodes that will be used to
  find the appropriate node -- in other words, it specifies the index of
  the lookup. In the example, this index is the cid attribute
  of the color elements, because a lookup will pass along a
  color ID string to look up the corresponding color name.
[Figure: Using an xsl:key element and key() function.]
The diagram above shows the four steps that take place for one
particular lookup:
1. The xsl:value-of element for the shirt template has a
   key() function that says "pass the colorCode
   attribute value to the colorNumKey key to get this
   value".
2. For the oxford button-down shirt element, this value
   is "c4".
3. The colorNumKey key sends the XSLT processor to
   look for this value in the cid attributes of the
   color elements.
4. It finds it and returns the matching color element, whose value the
   xsl:value-of element adds to the result tree.
If these color IDs and names were in a table, you could think of
the table as the "colorNumKey" lookup table, the nodes named by the
match attribute as the rows of the table, and the value or
values named by the use attribute as the index field (or
fields) of the table.
These color elements would fit nicely into a table, but
the beauty of doing this with XSLT (and XML) is that the elements
named by your match attribute can have structures that are
much more complex than any relational database table row. You have the
full power of XML available, and the ability to use an XPath
expression in the use attribute lets you identify any part of
that structure you want to use as the lookup key.
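As a minimal sketch of that flexibility, the stylesheet below indexes color elements by a value nested inside them rather than by an attribute of the element itself. The name and supplier child elements, the region attribute, and the "colorByRegion" key name are all hypothetical; they are not part of the shirts document used in this column.

```xml
<!-- Hypothetical source structure, invented for illustration only:
     <color cid="c4">
       <name>blue</name>
       <supplier region="east">Example Dye Co.</supplier>
     </color>
-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="text"/>
<!-- The use attribute is an XPath expression evaluated relative to
     each matched color element, so it can reach into child nodes. -->
<xsl:key name="colorByRegion" match="color" use="supplier/@region"/>
<xsl:template match="/">
  <!-- List the name of every color whose supplier is in the
       (hypothetical) "east" region, one per line. -->
  <xsl:for-each select="key('colorByRegion','east')">
    <xsl:value-of select="name"/><xsl:text>
</xsl:text>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>
```

The key declaration is the only part that changes when the structure gets more complex; the key() call looks the same as it does for a flat lookup.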
The key() function performs the actual lookup. It takes a
value, searches through the keys for one whose use value is
equal to the one it's looking for, and returns the element or elements
that have that key value. The example's template rule for the
shirt elements calls this function to insert the color name
before each shirt element's contents. The two arguments it
passes to this function are the name of the key ("colorNumKey", the
name of the lookup "table") and the value to use to look up the
needed value: the shirt element's colorCode
attribute value.
Because the key() function returns the node or nodes that
the lookup found, you can use the function call as part of an XPath
expression to pull an attribute value, subelement, or other subnode
out of the returned node. For example, if the color elements
had a PMSnum attribute, and you wanted to insert this
attribute value instead of the color elements' actual
content, you could use a value of "key('colorNumKey',@colorCode)/@PMSnum"
for the xsl:value-of element's select attribute.
Because the entire color node was used in the example above,
its character data contents (the part between the color
start- and end-tags) got added to the result tree.
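The PMSnum variation described above might look like the following sketch. It assumes that each color element in the source document has been given a PMSnum attribute, which the sample shirts document does not actually have; everything else matches the earlier key-based stylesheet.

```xml
<!-- Assumption: color elements carry a PMSnum attribute, e.g.
     <color cid="c4" PMSnum="...">blue</color>. The sample document
     in this column has no such attribute. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="text"/>
<xsl:key name="colorNumKey" match="color" use="@cid"/>
<xsl:template match="colors"/>
<xsl:template match="shirt">
  <!-- key() returns the matching color node; the second location
       step then selects that node's PMSnum attribute. -->
  <xsl:value-of select="key('colorNumKey',@colorCode)/@PMSnum"/>
  <xsl:text> </xsl:text><xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
```

If a color element lacks the attribute, the expression selects nothing and xsl:value-of adds an empty string to the result.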
Let's experiment with this color lookup table a little more. The
following stylesheet demonstrates several things you can do with
declared keys in XSLT using the same shirts source document
as the last example.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="text"/>
<xsl:key name="colorNumKey" match="color" use="@cid"/>
<xsl:key name="colorKey" match="color" use="."/>
<xsl:variable name="testVar">c4</xsl:variable>
<xsl:variable name="keyName">colorKey</xsl:variable>
<xsl:template match="colors">
Looking up the color name with the color ID:
c3's color: <xsl:value-of select="key('colorNumKey','c3')"/>
c4's color: <xsl:value-of select="key('colorNumKey',$testVar)"/>
c8's color: <xsl:value-of select="key('colorNumKey','c8')"/>
c7's colors:
<xsl:for-each select="key('colorNumKey','c7')">
  <xsl:value-of select="."/><xsl:text> </xsl:text>
</xsl:for-each>
Looking up the color ID with the color name:
blue's cid: <xsl:value-of select="key('colorKey','blue')/@cid"/>
black's cid: <xsl:value-of select="key($keyName,'black')/@cid"/>
gray's cid: <xsl:value-of select="key('colorKey','gray')/@cid"/>
</xsl:template>
<!-- Don't bother outputting shirt contents for this example. -->
<xsl:template match="shirt"/>
</xsl:stylesheet>
Before discussing what it does, let's look at the result it
creates.
Looking up the color name with the color ID:
c3's color: red
c4's color: blue
c8's color:
c7's colors:
orange green
Looking up the color ID with the color name:
blue's cid: c4
black's cid: c2
gray's cid:
The first three xsl:value-of instructions use the same
"colorNumKey" key that the previous example did. The first
xsl:value-of instruction passes the literal string "c3" as
the index value to look up, and the result shows that "c3" is the key
for the color "red". The second shows how a variable can be used for
this argument to the key() function: an xsl:variable
instruction near the beginning of the stylesheet declares a
testVar variable with a value of "c4", and when the XSLT
processor uses this variable to look up a color name, the result shows
that this finds the color "blue".
The third xsl:value-of instruction in the stylesheet
passes the string "c8" to use for the lookup, and there is no
color element with a cid attribute value of "c8", so
nothing shows up in the result tree after "c8's color:".
The next part of the template looks up the value "c7". The document
has two color elements with a cid value of "c7", so
the template uses an xsl:for-each instruction instead of an
xsl:value-of one to add both of them to the result tree. (If
it had used xsl:value-of, only the first would have appeared
in the result.) A key() function can return multiple nodes,
and this one does, so the xsl:for-each instruction iterates
through the "c7" nodes, printing the value and a space (using an
xsl:text element for the latter) for each.
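When a key can return several nodes, you may want a separator between them rather than a trailing space after each. The sketch below is one way to do that with the position() and last() functions; the comma separator is an arbitrary choice for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="text"/>
<xsl:key name="colorNumKey" match="color" use="@cid"/>
<xsl:template match="colors">
  <!-- Iterate over every color node whose cid is "c7". -->
  <xsl:for-each select="key('colorNumKey','c7')">
    <xsl:value-of select="."/>
    <!-- Add a separator after every node except the last one. -->
    <xsl:if test="position() != last()">, </xsl:if>
  </xsl:for-each>
</xsl:template>
<xsl:template match="shirt"/>
</xsl:stylesheet>
```

Run against the shirts document, the for-each loop writes "orange, green" for the two "c7" color elements.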
The beginning of this stylesheet declares two keys: "colorNumKey"
is the same one we saw in the previous stylesheet, and "colorKey" is
the one used by the remaining xsl:value-of instructions in
this new stylesheet. Its use attribute names the
color elements' contents (".") as the lookup index,
and each of the three xsl:value-of elements passes this key a
color name to look up the node instead of passing a string to match
against the color elements' cid values. The entire
color node still gets returned, and these three
xsl:value-of elements each pull the cid attribute value
out of this node by adding a slash and "@cid" to make a second
location step for the XPath expression in each xsl:value-of
element's select attribute.
So, instead of passing a color ID value to get a color name, these
last three lookups are each passing a color name to get a color
ID. They're looking up the same type of node in the same set of nodes
using a different part of those nodes as the lookup index. Getting
back to the table analogy, it's like looking up rows in the same table
that we used before but using a different column as the key field.
The first of these last three lookups passes the string "blue", and
the XSLT processor adds "c4" as the corresponding color ID to the
result tree. The second passes the string "black", but unlike any of
the lookups before, this one identifies the key name by using a
variable instead of a hardcoded string: $keyName, which was
set to "colorKey" near the beginning of the stylesheet. This causes no
problems, and the "c2" color ID corresponding to "black" gets added to
the result tree.
The last key() function call tries to look up the color
name "gray", and there is none in the key. The function returns
nothing, and nothing gets added after the text node "gray's cid" in
the result tree.
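If an empty result is not what you want for a failed lookup, one option is to test the key() result before using it. This is a sketch, not something the stylesheets in this column do; the "(unknown color)" placeholder text is an assumption chosen for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="text"/>
<xsl:key name="colorNumKey" match="color" use="@cid"/>
<xsl:template match="colors"/>
<xsl:template match="shirt">
  <xsl:choose>
    <!-- A node set converts to true when it holds at least one node,
         so this branch runs only when the lookup found something. -->
    <xsl:when test="key('colorNumKey',@colorCode)">
      <xsl:value-of select="key('colorNumKey',@colorCode)"/>
    </xsl:when>
    <!-- Placeholder text for a colorCode with no matching color. -->
    <xsl:otherwise>(unknown color)</xsl:otherwise>
  </xsl:choose>
  <xsl:text> </xsl:text><xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
```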
The lookup keys don't have to be in the same document as the
elements that trigger the lookup. If the example document's
colors element had been in a separate document in a separate
file, you could still declare its contents as a key and use it for
looking up the shirt colors in this document. This ability to
look something up in an external data source lets you develop some
very powerful document processing systems. In the next "Transforming
XML" column, we'll see how to read in multiple documents and, among
other things, use one for lookups like these. (If you're in a real
hurry to find out how, see my book XSLT Quickly, from
which these columns are excerpted.)
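As a rough preview, here is a minimal sketch of one common XSLT 1.0 approach. It assumes the colors element has been moved into a separate file named colors.xml, a hypothetical name, located relative to the stylesheet. In XSLT 1.0 the key() function searches the document that contains the context node, so the sketch uses xsl:for-each to make the external document the context before calling key().

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="text"/>
<!-- A key declaration applies to any document the stylesheet reads,
     not only to the main source document. -->
<xsl:key name="colorNumKey" match="color" use="@cid"/>
<xsl:template match="shirt">
  <!-- Save the code first, because the context node changes inside
       the xsl:for-each below. -->
  <xsl:variable name="code" select="@colorCode"/>
  <!-- colors.xml is a hypothetical file holding the colors element. -->
  <xsl:for-each select="document('colors.xml')">
    <xsl:value-of select="key('colorNumKey',$code)"/>
  </xsl:for-each>
  <xsl:text> </xsl:text><xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
```

The xsl:for-each here iterates over a single node, the root of the external document; it is used only to switch context, not to loop.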