XSLT UK 2001 Report

by Jeni Tennison
April 25, 2001

April 8th and 9th 2001 saw the first conference dedicated to XSLT take place at Keble College in Oxford. While the basis of the conference was XSLT, this didn't stop people talking about the XSL effort in general or about other vocabularies and technologies that work with or against XSLT.

Opening Address

The conference was opened by Norm Walsh from Sun Microsystems, member of the XSL Working Group and maintainer of one of the more complex XSL applications -- the DocBook XSL family, which he talked about later in the day. Norm set the scene for the conference, reminding us of the origins of XSLT and outlining four requirements that will make XSLT and XPath as ubiquitous as XML has become:

  • interoperable tools,
  • cooperative specs,
  • optimizations or compilations of stylesheets, and
  • information set pipelines.

XSLT and the Art of Motorcycle Maintenance

Next up was David Carlisle, from NAG Ltd., one of the editors of MathML and an XSL-List regular. David gave another view of XSLT's heritage, as a functional programming language fitting into the same development path as Scheme or DSSSL. He outlined the benefits of taking a functional approach to presenting information, especially with web-based content, where random access means that you need something that allows you to process only parts of the content and still work reliably (for example, in numbering pages without having to process each page to construct the number). David had the title for his talk thrust upon him, but he still managed to bring in a reference to the seminal book "Zen and the Art of Motorcycle Maintenance" with a quote.

After a while he says, "Can I have a motorcycle when I get old enough?"

"If you take care of it."

"What do you have to do?"

"Lots of things. You've been watching me."

"Will you show me all of them?"

"Sure."

"Is it hard?"

"Not if you have the right attitudes. It's having the right attitudes that's hard."

"Oh."

After a while I see he is sitting down again. Then he says, "Dad?"

"What?"

"Will I have the right attitudes?"

"I think so," I say. "I don't think that will be any problem at all."

And so we ride on and on, down through Ukiah, and Hopland, and Cloverdale, down into the wine country...

Beginners can find XSLT difficult to deal with, especially when they come from a procedural-language background. But XSLT isn't hard if you have the right attitude.

XSLT Design Patterns

I spoke next, representing only myself and drawing on my experience answering questions on XSL-List. I outlined some of the design patterns that have emerged in the use of XSLT. Using examples from an application I worked on for Xi advise bv, I spoke about four levels of design patterns.

  • application level: combining stylesheets and using XSLT within a wider context -- I specifically talked about getting multiple views of the same data using XSLT
  • stylesheet level: the flow of processing within the application -- I talked about the differences between push and pull, and how to combine them, and about grouping by position, in hierarchies and by value (using the Muenchian Method)
  • template level: patterns in instructions such as Wendell Piez's method for repetition and David Allouche's method for normalizing strings
  • XPath level: expressions for getting unique nodes, for set manipulation and for conditional XPaths, such as Oliver Becker's method

Throughout, I talked about the way that identifying these methods can help us to identify the areas where XSLT and XPath need to be developed.
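As a small illustration of grouping by value, here is a sketch of the Muenchian Method run through the JAXP API that ships with Java. The input document, the key name `by-city`, and the class name are all invented for this example; they are not taken from the talk or from the application discussed there.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class MuenchianGroupingDemo {
    // Hypothetical input: a flat list of people to be grouped by city.
    static final String SOURCE =
        "<people>"
      + "<person city='Oxford'>Ann</person>"
      + "<person city='Leeds'>Bob</person>"
      + "<person city='Oxford'>Cat</person>"
      + "</people>";

    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
        // The key indexes every person by the value of its city attribute.
      + "<xsl:key name='by-city' match='person' use='@city'/>"
      + "<xsl:template match='/people'>"
        // Keep only the first person in each group: the one whose generated
        // id equals that of the first node the key returns for its city.
      + "<xsl:for-each select='person[generate-id() = generate-id(key(\"by-city\", @city)[1])]'>"
      + "<xsl:value-of select='@city'/><xsl:text>=</xsl:text>"
        // Then list all members of that group, in document order.
      + "<xsl:for-each select='key(\"by-city\", @city)'>"
      + "<xsl:if test='position() != 1'>,</xsl:if>"
      + "<xsl:value-of select='.'/>"
      + "</xsl:for-each>"
      + "<xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String stylesheet, String source) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)))
                .transform(new StreamSource(new StringReader(source)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(STYLESHEET, SOURCE)); // Oxford=Ann,Cat;Leeds=Bob;
    }
}
```

The outer `xsl:for-each` is "pull" style; the same predicate works equally well in the `select` of an `xsl:apply-templates` if you prefer to push the group leaders to a template.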

XSLT Performance

We were then treated to a talk by Mike Kay that highlighted the experiences of implementers. Now at Software AG, he is a member of the XSL Working Group and another regular contributor on XSL-List, but he's probably most well known as the implementer of the Saxon XSLT processor and the author of the XSLT Programmer's Reference.

Mike spoke about XSLT performance. He advised that you only need to worry about the performance of XSLT processors or stylesheets if you have business requirements that demand a certain throughput or response time, although you might also be concerned about the predictability, tuneability, or scalability of a particular stylesheet.

While he didn't specifically talk about Saxon, Mike showed the basic way an XSLT processor works: taking the XML stylesheet, turning it into a tree, 'compiling' that tree, similarly taking the XML source and turning that into a tree, and then constructing the result tree (theoretically in memory, but often practically outputting it immediately).
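From the user's side, those stages can be seen in the JAXP API, sketched below under the assumption that you are using the processor bundled with Java rather than Saxon itself. The stylesheet, the `greeting` element, and the class name are made up for illustration. The stylesheet is parsed and prepared once as a `Templates` object, and each source document is then parsed and transformed separately.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class CompileOnceDemo {
    // Hypothetical stylesheet, used only to have something to compile.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='greeting'>Hello, <xsl:value-of select='@to'/>!</xsl:template>"
      + "</xsl:stylesheet>";

    static List<String> run(List<String> sources) {
        try {
            // Stage 1: read the stylesheet and prepare it once.
            Templates compiled = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(STYLESHEET)));
            List<String> results = new ArrayList<>();
            for (String source : sources) {
                // Stage 2: parse each source document and build the result,
                // which is serialized straight into the writer.
                StringWriter out = new StringWriter();
                compiled.newTransformer().transform(
                    new StreamSource(new StringReader(source)), new StreamResult(out));
                results.add(out.toString());
            }
            return results;
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> sources = new ArrayList<>();
        sources.add("<greeting to='Oxford'/>");
        sources.add("<greeting to='Keble'/>");
        System.out.println(run(sources)); // [Hello, Oxford!, Hello, Keble!]
    }
}
```

Reusing the `Templates` object means the cost of reading and preparing the stylesheet is paid once, however many source documents follow.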

Mike described the most important things for XSLT processor efficiency: tight code, name management, XPath queries, XSLT pattern matching, pipelining, and the storage of node sets. He discussed the issues involved in constructing a node tree for XPath/XSLT processing, especially given its differences from the DOM. (XPath node trees don't include CDATA or entity nodes, and there is different handling of whitespace.) He also outlined the Tiny Tree Model that he now uses in Saxon (after seeing a similar technique in Xalan), where transient objects are created from arrays as required. This gives real advantages, allowing run-time decisions about the kinds of access paths that should be stored (for example, you only need to store information about what a node's parent is if you need to access a node's parent).
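To give a flavour of the idea, here is a deliberately simplified sketch of an array-backed tree. It is not Saxon's actual data layout: the class, the two arrays, and the example document are assumptions made for illustration. Nodes live in parallel arrays in document order, node objects are created only when asked for, and no parent pointer is stored; the parent is worked out from the depth array on demand.

```java
import java.util.ArrayList;
import java.util.List;

class TinyTreeSketch {
    // Parallel arrays indexed by node number, in document order.
    private final String[] names;
    private final int[] depths;

    TinyTreeSketch(String[] names, int[] depths) {
        this.names = names;
        this.depths = depths;
    }

    /** Transient view of one node: it holds only an index into the arrays. */
    final class Node {
        final int index;

        Node(int index) {
            this.index = index;
        }

        String name() {
            return names[index];
        }

        /** No parent array is stored; scan back to the nearest shallower node. */
        Node parent() {
            for (int i = index - 1; i >= 0; i--) {
                if (depths[i] == depths[index] - 1) {
                    return new Node(i);
                }
            }
            return null; // the root has no parent
        }

        /** Children are the following nodes exactly one level deeper. */
        List<Node> children() {
            List<Node> result = new ArrayList<>();
            for (int i = index + 1; i < names.length && depths[i] > depths[index]; i++) {
                if (depths[i] == depths[index] + 1) {
                    result.add(new Node(i));
                }
            }
            return result;
        }
    }

    Node node(int index) {
        return new Node(index);
    }

    /** Hypothetical document: doc(section(title, para), section(para)). */
    static TinyTreeSketch example() {
        return new TinyTreeSketch(
            new String[] {"doc", "section", "title", "para", "section", "para"},
            new int[] {0, 1, 2, 2, 1, 2});
    }

    public static void main(String[] args) {
        TinyTreeSketch tree = example();
        System.out.println(tree.node(3).parent().name());   // section
        System.out.println(tree.node(0).children().size()); // 2
    }
}
```

A processor that knows a stylesheet uses the parent axis heavily could decide to add a parent array at that point; one that never needs it saves the space.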

The areas for future optimization that implementers have barely touched yet are:

  • parallel execution, which should be possible as XSLT is side-effect free
  • compilation of stylesheets into byte code, something picked up by Morten Jørgensen in the next talk
  • global optimization of processing flow, as opposed to local optimization of XPaths
  • serial transformations, if it's possible to detect those (parts of) transformations that don't require access to the entire tree
  • exploiting XML schemas

There were some tips for users too:

  • follow good performance engineering practice: record the time a stylesheet takes before and after making each change, and change it back if it doesn't improve
  • use small documents rather than large ones
  • don't assume that the processor makes a particular optimization
  • minimize the number of visits to each node
  • use variables
  • use temporary trees (result tree fragments in XSLT 1.0)
  • use keys
  • don't use xsl:number
  • don't worry about changes that can give less than a 10% improvement
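The first tip and the one about keys can be combined in a small measurement harness. The sketch below is my own illustration, not something shown in the talk: the data, the two stylesheet variants, and all the names are invented. It checks that a scanning lookup and a keyed lookup give the same output and then reports the best time for each, without assuming which one a given processor will run faster.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StylesheetTimingDemo {
    private static final String HEAD =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>";

    // Variant 1: search all products again for every order.
    static final String SCAN = HEAD
      + "<xsl:template match='/data'><xsl:for-each select='order'>"
      + "<xsl:value-of select='/data/product[@id = current()/@ref]/@name'/>"
      + "<xsl:text> </xsl:text></xsl:for-each></xsl:template></xsl:stylesheet>";

    // Variant 2: look products up through a key.
    static final String KEYED = HEAD
      + "<xsl:key name='product' match='product' use='@id'/>"
      + "<xsl:template match='/data'><xsl:for-each select='order'>"
      + "<xsl:value-of select='key(\"product\", @ref)/@name'/>"
      + "<xsl:text> </xsl:text></xsl:for-each></xsl:template></xsl:stylesheet>";

    /** Builds hypothetical test data: n products and n orders that refer to them. */
    static String sampleData(int n) {
        StringBuilder xml = new StringBuilder("<data>");
        for (int i = 0; i < n; i++) {
            xml.append("<product id='p").append(i).append("' name='N").append(i).append("'/>");
        }
        for (int i = 0; i < n; i++) {
            xml.append("<order ref='p").append(i * 2 % n).append("'/>");
        }
        return xml.append("</data>").toString();
    }

    static Templates compile(String stylesheet) {
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(stylesheet)));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    static String run(Templates compiled, String source) {
        try {
            StringWriter out = new StringWriter();
            compiled.newTransformer().transform(
                new StreamSource(new StringReader(source)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Best elapsed time over several runs; compilation is excluded from the timing. */
    static long bestNanos(String stylesheet, String source, int runs) {
        Templates compiled = compile(stylesheet);
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            run(compiled, source);
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        String data = sampleData(2000);
        // Only compare timings once both variants are known to agree.
        if (!run(compile(SCAN), data).equals(run(compile(KEYED), data))) {
            throw new IllegalStateException("variants disagree");
        }
        System.out.println("scan:  " + bestNanos(SCAN, data, 5) / 1_000_000 + " ms");
        System.out.println("keyed: " + bestNanos(KEYED, data, 5) / 1_000_000 + " ms");
    }
}
```

Taking the best of several runs is one simple way to reduce noise from JVM warm-up; whatever method you choose, measure both before and after, and keep the change only if the numbers justify it.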

The XSLT Compiler for JVM

Morten Jørgensen, from Sun Microsystems, introduced the XSLT Compiler (XSLTC). XSLTC creates "translets": Java classes that run about 30-200% faster than interpretive XSLT processors and are usually about a quarter of the size of an XSLT processor and stylesheet. Because of their size and platform independence, these translets can run on virtually anything, including handheld machines.

With XSLTC, stylesheets can be compiled into translet bundles, each one of which contains a main class and a set of auxiliary classes for elements that require special handling. These are shipped with an XSLT runtime library, containing a tailored DOM with SAX interfaces for input and output.

For authors using XSLTC, Morten outlined a few tips. The main body of a translet is a switch statement, with each case being a particular match pattern. Authors should therefore keep match patterns simple and, in particular, avoid unioned match patterns. At an application level, developers should take advantage of the cacheability of the DOMs used by XSLTC, as XML parsing can take as much as 50% of the total processing time.
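To picture what "a switch statement with one case per match pattern" might look like, here is a hand-written sketch. It is not code generated by XSLTC, and the type codes, method names, and the two example patterns are all hypothetical; it only shows the general shape of dispatching on a node type.

```java
class TransletSketch {
    // Hypothetical type codes a compiler might assign to the element names
    // that appear in the stylesheet's match patterns.
    static final int TITLE = 0;
    static final int PARA = 1;
    static final int OTHER = 2;

    static int typeOf(String elementName) {
        switch (elementName) {
            case "title": return TITLE;
            case "para":  return PARA;
            default:      return OTHER;
        }
    }

    /** One case per simple match pattern; each case body plays the part of a template. */
    static void applyTemplates(String elementName, String text, StringBuilder out) {
        switch (typeOf(elementName)) {
            case TITLE: // stands in for a template with match="title"
                out.append("<h1>").append(text).append("</h1>");
                break;
            case PARA:  // stands in for a template with match="para"
                out.append("<p>").append(text).append("</p>");
                break;
            default:    // no template matched: copy the text through
                out.append(text);
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        applyTemplates("title", "Report", out);
        applyTemplates("para", "Text", out);
        System.out.println(out); // <h1>Report</h1><p>Text</p>
    }
}
```

With simple name patterns, each template maps onto a single case. A pattern that carries predicates or a union cannot be decided from the type code alone, which is one plausible reading of why simple patterns were recommended.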

XSLTC is still alpha software, but the only outstanding features needed for conformance with XSLT 1.0 are support for simplified stylesheets (where the document element of the stylesheet is not xsl:stylesheet), the namespace axis, and id() and key() functions within match patterns.
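For readers who haven't met one, a simplified stylesheet is a single literal result element carrying an `xsl:version` attribute, which the processor treats as a template matching the root node. The sketch below runs one through the JAXP API bundled with Java; the `summary` and `report` element names and the class name are invented for the example, and it assumes the processor in use supports this form.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SimplifiedStylesheetDemo {
    // The document element is a literal result element, not xsl:stylesheet.
    static final String STYLESHEET =
        "<summary xsl:version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:value-of select='/report/title'/>"
      + "</summary>";

    // Hypothetical source document.
    static final String SOURCE = "<report><title>XSLT UK 2001</title></report>";

    static String transform() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            // Leave out the XML declaration so only the element is printed.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(SOURCE)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform());
    }
}
```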
