Adobe's InDesign and XML
by David Miller
August 04, 2004
Formatting and typesetting documents has come a long
way in the relatively short span of time that modern computers have
been around. The process has typically revolved around formatting
scientific or technical documents using a variety of
command-line tools. However, the latest version of Adobe's page-layout application, InDesign, integrates XML files into its visually oriented publishing workflow.
Old-School Document Generation
One of UNIX's first killer applications was document
typesetting. While making documents is no longer the sole reason to
invest money in a box that flips bits as fast as possible, the purpose
behind it remains just as relevant today as it was 30 years ago when troff first hit the
scene: to offload as much work as possible to the machine,
thereby allowing the author(s) to concentrate on the content
of the document, rather than its appearance.
troff typesetting is achieved by feeding a
specially formatted document to a series of
preprocessors, repeatedly piping the output of one
preprocessor to the input of the next. This preprocessing continues until the content of the original
document has been suitably massaged and can be parsed by the troff
utility, which in turn produces a PostScript copy of the document.
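As a rough illustration of that piping, the sketch below assembles such a pipeline as a shell command string in Python. The stage list (tbl for tables, eqn for equations, then groff with the -ms macros and PostScript output) and the file name report.ms are assumptions chosen for the example; a real document may need different preprocessors or a different order.

```python
import shlex


def build_pipeline(source, stages):
    """Join stages into one shell pipeline; the first stage reads `source`."""
    if not stages:
        raise ValueError("at least one stage is required")
    # Only the first stage is given the file name; later stages read stdin.
    first = shlex.join(stages[0] + [source])
    rest = [shlex.join(stage) for stage in stages[1:]]
    return " | ".join([first] + rest)


# Assumed stage order: tables, then equations, then the formatter itself.
STAGES = [["tbl"], ["eqn"], ["groff", "-ms", "-Tps"]]

if __name__ == "__main__":
    print(build_pipeline("report.ms", STAGES))
```

The function only builds the command text and does not run it, so it can be tried on a machine without any typesetting tools installed.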
New-School Document Generation In a Nutshell
A New Tool Is Born
Fast-forward 20 years from the birth of UNIX, when
HTML is being sent across the Internet at an exploding rate. Realizing that they might be on to something, the W3C simplifies and standardizes HTML's ancestor, SGML, to become XML. Serving as a general-purpose data-exchange format, XML syntaxes are created for countless applications, including vector graphics (SVG), remote procedure invocation (XML-RPC), and document formatting (XSL-FO).
While there are many simple XML applications, XSL-FO definitely doesn't fit this classification. This is simply due to the complex nature of controlling the layout for a
multiple-page document: headers and footers, margins, and all sorts of font properties (family, size, weight, leading, kerning,
alignment, and countless other details) must be taken into account to
transform a marked-up document into one that can be
professionally printed and bound.
To avoid interacting with this level of complexity, authors typically
mark up their content with another syntax (such as DocBook), which is then transformed with XSLT into
another syntax for presentation -- creating a pipeline of XML documents that are passed between processors
(similar to troff's pipeline). And while XSL-FO is well-suited for documents
such as theses, manuals, and other technical
references -- documents that typically have a simple layout with a
restricted set of fonts, colors, and external resources -- it is
virtually impossible to use it to format anything that doesn't fit within
a rectangular box.
However, this "box" limitation isn't an
oversight; the W3C never intended the XSL-FO specification to fill any other shoes. After all, anyone who needs a customized XML workflow is free to create one, which is exactly what Adobe has done with the latest release of its Creative Suite software.
In Steps InDesign
InDesign, a component of Adobe's Creative Suite, is a
page-layout program aimed at Adobe's core audience:
professional graphic and media designers. Whereas XML and troff pipelines are primarily used to create technical documents, InDesign is primarily used to create visual documents such as brochures, advertisements, and other media (there are exceptions, of course).
In this brief tutorial, we'll walk through a simple example of
how InDesign CS allows a publishing workflow to be broken into two
pieces: developing the document's appearance and then its
content.
Styles
Applications that interact with documents (whether InDesign,
an XSL-FO processor, a web browser, or even Microsoft Word) use the concept of styles, which allow any number of
components of a document to have their appearance governed by a style declaration. Typically a style declaration controls an object's appearance: color, spacing, and other properties
that depend on the application's domain.
XSL-FO styles are created using XML attributes, many of which are analogous to CSS-style declarations (such as font-family and border-left-width). InDesign allows authors to define
styles in a manner that is familiar to anyone who has worked with
similar design programs; the following image shows how basic character
properties of the "employeeInfo" style are defined:

Figure 1: A screenshot illustrating how paragraph styles
are defined through InDesign's interface.
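By way of contrast with the dialog in Figure 1, the attribute-based XSL-FO approach described above can be sketched in a few lines of Python. The helper name styled_block and the style values (Helvetica, 9pt, 1pt) are hypothetical stand-ins for something like the "employeeInfo" style; only the fo namespace URI and the property names come from XSL-FO itself.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def styled_block(text, **style):
    """Build an fo:block whose style is carried entirely by attributes.

    Keyword names use underscores (font_family) and are converted to the
    hyphenated XSL-FO property names (font-family).
    """
    attrs = {name.replace("_", "-"): value for name, value in style.items()}
    block = ET.Element(f"{{{FO_NS}}}block", attrs)
    block.text = text
    return block


# Hypothetical values standing in for an "employeeInfo"-like style.
info = styled_block(
    "(403) 555-5555",
    font_family="Helvetica",
    font_size="9pt",
    border_left_width="1pt",
)

if __name__ == "__main__":
    print(ET.tostring(info, encoding="unicode"))
```

Every styled block repeats its attributes in the markup, whereas a named InDesign style is defined once and then applied wherever it is needed.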
Adding XML to the Mix
Among other new features included in this version of the application,
InDesign CS introduces an XML compatibility layer that allows it to import
and export XML
files that encapsulate the information contained within the document.
In the previous version of InDesign, the written material of our
documents was tied up in the text blocks that were defined in the
document itself (indeed, this is the case for most files that are
manipulated with a proprietary application); similarly, the only way
to manipulate other resources used within the document, such as
images, was through the application's interface. This is no
longer the case with InDesign CS.
InDesign's XML Savvy
In much the same way that CSS abstracts the appearance of
an XHTML document, InDesign
CS's XML compatibility layer provides a powerful way to separate the content
and style of print documents that, in the past, have typically had
their content and presentation tied up in a knot.
Essentially, this layer allows you to define the template of your
document from within InDesign, and to develop the content of the
document in any text editor. Once the template and content have been
finalized, it is simply a matter of merging the two together, and
exporting the final work to the format of your choice.
A Simple Scenario
To see how InDesign's XML layer can ease a publishing workflow,
let's walk through a simple scenario where a small company needs
business cards printed for each of its employees. While the cards will
share some information (such as the company's name), each card
will have unique information for the rest of the fields (such as the
employee's name, email address, etc.).
Traditionally, this process would be completed in one of two ways,
each with its own benefits and drawbacks:
- By creating one InDesign file for each employee and copying and pasting the common information between the documents
while keeping the employee's unique information separate,
or...
- By creating one InDesign file for the entire company, and
merely changing the unique information for each employee by cutting and pasting before we export the document to a printable
format.
However, the two methods outlined above also share the same problem:
if, at any point after exporting the file, we make a change to the
template, then the change must be reflected in every file
that uses the template. InDesign alleviates this problem by allowing
us to define a template and an XML file to provide the necessary information for
a document. Thus, the XML file acts as a kind of simple data store,
allowing the data to be created, manipulated, or used with other
applications.
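To make the "data store" idea concrete, here is a minimal sketch of how another application might produce one employee's XML file from a plain record. The element names are borrowed from the listings that appear later in this tutorial; the function name build_card_xml is an assumption made for illustration, and nothing here is an Adobe-provided API.

```python
import xml.etree.ElementTree as ET

# Element names copied from the tutorial's listings; order is preserved.
FIELDS = ("employeeName", "employeeEmail", "employeeAddress", "employeePhone")

XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'


def build_card_xml(record):
    """Return one employee's card data as an XML string rooted at <Root>."""
    missing = [name for name in FIELDS if name not in record]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    root = ET.Element("Root")
    for name in FIELDS:
        ET.SubElement(root, name).text = record[name]
    body = ET.tostring(root, encoding="unicode")
    return XML_DECLARATION + "\n" + body


dave = {
    "employeeName": "davidfmiller",
    "employeeEmail": "davidfmiller@gmail.com",
    "employeeAddress": "1234 main street calgary, ab, ca a1b 2c3",
    "employeePhone": "(403) 555-5555",
}

if __name__ == "__main__":
    print(build_card_xml(dave))
```

Because the record is an ordinary dictionary, the same data could just as easily come from a spreadsheet export or a database query before being written out for import.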
A Simple How-To
The first step in our workflow is to define the template we'll be
using for our business cards. In order to do this, we'll have
to:
- Set up the layout of the business card, thereby creating the static
content that will be shared among all of the cards.
- Specify the text blocks (and optionally any images) that will vary
from one card to the next (in our example this will consist of the
employee's name, address, email, and phone number).
- Define the styles that will be applied to control their
appearance.
This process can be seen in Figure 2; the employee's
address, email, and phone number have been designated as variables
using the InDesign Tags palette, and are mapped to
the "employeeInfo" style (defined in Figure 1), while the
"employeeName" variable is mapped to a style of the same
name.

Figure 2: Defining how the elements of our XML file will
appear using the styles declared in our InDesign document.
As you can also see in the figure above, the four variables for each
of our business cards are represented as sibling elements under a
common parent element named Root. Following the rule of
least surprise, the XML document that we will import must
follow the same structure. The following code is a listing of
the XML
documents that represent two of our company's employees:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Root>
  <!-- contents of dave.xml -->
  <employeeName>davidfmiller</employeeName>
  <employeeEmail>davidfmiller@gmail.com</employeeEmail>
  <employeeAddress>1234 main street calgary, ab, ca a1b 2c3</employeeAddress>
  <employeePhone>(403) 555-5555</employeePhone>
</Root>

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Root>
  <!-- contents of dylan.xml -->
  <employeeName>dylan mckay</employeeName>
  <employeeEmail>joey@fivevoltlogic.com</employeeEmail>
  <employeeAddress>1234 main street calgary, ab, ca a1b 2c3</employeeAddress>
  <employeePhone>(403) 555-4444</employeePhone>
</Root>
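Since each file has to mirror the structure defined in the template, it can be handy to check a file before importing it. The following is a minimal sketch, assuming the four element names shown in the listings above; the function name structure_problems is hypothetical, and the check only covers the root element and the presence of each child, not whatever rules InDesign itself applies on import.

```python
import xml.etree.ElementTree as ET

# Child elements the template expects under <Root>, per the listings above.
EXPECTED = ["employeeName", "employeeEmail", "employeeAddress", "employeePhone"]


def structure_problems(xml_text, expected=EXPECTED):
    """Return a list of readable problems; an empty list means a match."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = []
    if root.tag != "Root":
        problems.append(f"root element is <{root.tag}>, expected <Root>")
    # Comments are dropped by the default parser, so only elements remain.
    found = [child.tag for child in root]
    for name in expected:
        if name not in found:
            problems.append(f"missing <{name}>")
    for name in found:
        if name not in expected:
            problems.append(f"unexpected <{name}>")
    return problems
```

Calling structure_problems with the text of dave.xml should return an empty list, while a file that omits employeePhone would be reported before it ever reaches the layout.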
Importing our XML file is simply a matter of bringing up a
contextual menu and locating the appropriate file on disk.

Figure 3: Importing an XML
file to populate the placeholders with an employee's information.
After importing each XML file, the placeholders of our InDesign
template will be populated with the corresponding data from the
imported file. Thus, exporting a print-ready business card for
each of the company's employees is only a few mouse clicks away.

Figure 4: The final series of business cards.
Closing Tag
Adobe has applied the template concepts used in other XML technologies to
InDesign's visual environment (and made them accessible to
designers in the process), allowing InDesign to cooperate with
applications on a level that was previously impossible.
This tutorial provides only a brief glimpse into
InDesign's XML capabilities and is by no means a
comprehensive resource; interested readers can find more
information from Adobe.