To Tag or Not to Tag
by Patrick O'Kelley
May 26, 2004
The New Variorum Shakespeare and XML
Every generation remakes Shakespeare for itself with new costumes,
new set designs, and new interpretations. But, despite numerous
advances in humanities computing, variorum editions of the works of
Shakespeare have relied on models established well before the
digital age. Since the nineteenth century, scholars have slowly
worked through the plays and sonnets to create definitive, variorum
editions that include the variations of each line of each play as
well as all of the major critical commentary. The undertaking, which
was assumed by the Modern Language Association (MLA)
in 1936, is obviously mammoth. But until recently, the variorum
editions -- thick with special typographical marks and a complex web
of cross-references -- were prepared solely as print texts running
hundreds of pages long. Later this year, however, the MLA will bring
the New Variorum Shakespeare (NVS) project into the world of XML for
the first time.
What is a Shakespeare Variorum Edition?
Paul Werstine, General Editor of the New Variorum Shakespeare
editions, says, "the essential purpose of the New Variorum
Shakespeare has always been to provide a detailed history of
critical commentary together with an exhaustive study of the
text.... In its inclusiveness and its historical orientation the NVS
differs from regular scholarly editions."
The whole idea of creating Shakespeare variorum editions began with
Horace Howard Furness's (1833-1912) publication of his first
variorum Shakespeare in 1871. Furness, a member of the Shakespeare
Society of Philadelphia, was troubled by the lack of historical
context for variant readings of the plays, and he set out to provide
the bibliographic apparatus to fill this gap. Furness's son, Horace
Howard Furness, Jr. (1865-1930), continued in his father's footsteps,
completing his last Shakespeare editorial work in 1928.
Since 1936 the MLA has shepherded nine NVS plays into print, with two
more nearing the final stages and others in various stages of
completion. Each is a massive undertaking for an individual scholar,
as the weight of new literary criticism continues to mount each
year. "It is really time consuming. We're talking decades," says
Judith Altreuter, Print and Electronic Production Director at the
MLA.
XML for Shakespeare Variorum Editions?
In the 1990s, the MLA turned to the Center for Electronic Texts in the
Humanities (CETH), then a Princeton/Rutgers organization, to
produce a feasibility study regarding the possible production of
electronic editions to accompany the print versions of the NVS.
The MLA had several reasons to seek advice toward a digital
strategy. Given the physical size of the print format, NVS editions,
like dictionaries, were an obvious candidate for digitization simply
to make them more compact.
At the same time, the electronic form needed to be malleable enough
to suit a variety of research uses. One could imagine multiple
front-ends, each optimized for a particular purpose. While the MLA
did not have specific, immediate clients to serve, they needed to
produce a text that could be adapted quickly to future electronic
research needs.
Finally, some early e-text projects had met with failure as the
proprietary software used to create and read the texts became
outdated. Given the life-cycle of a typical variorum edition (10-30
years), the editors needed to know that the software or encoding
they used at the beginning of the project would still be viable when
the edition was completed.
The CETH researchers (including the author of this article)
proposed Text Encoding Initiative (TEI) P3 SGML as the single best
answer to the MLA committee's question of which electronic format to
use for the NVS. (The most recent update, the 2002 P4 version of the
standard, moved to XML while remaining backward compatible.) By the
mid-1990s, the TEI standard already included detailed guidelines for
encoding a rich and complicated play with extensive commentary, like
the NVS editions. The CETH group also prepared a full demo for the
MLA of an SGML marked-up version of the first few pages of
the New Variorum Antony and Cleopatra, which had been
published in 1990.
The TEI, "an international and interdisciplinary standard" under
the auspices of the Association for
Computers and the Humanities, provided the academic clout and
computing experience necessary to promote the comprehensive,
long-term, stable solution that the MLA was looking for. And, as the
CETH group demonstrated, TEI encoding could capture the
sophisticated apparatus that accompanied the primary text.
Still, simply deciding to use the TEI was not enough. The MLA
needed to commit resources to determine precisely how the flexible
tagging system would be implemented. And, of course, they needed a
play to work on.
Winter's Tale in XML
After several years of discussion and debate, the MLA is now
developing an official encoding plan. Winter's Tale,
which is expected to be completed in late winter, will be the first
NVS edition to appear since the CETH recommendations were made, and
the MLA confirms that they will, indeed, be providing a CD-ROM in
the back of the print edition that includes a full TEI XML version
of the play. "[TEI XML] seemed like a clear choice for a project of
this scope and importance, because it offers a well-tested basis for
high-quality XML encoding and because it has a strong institutional
and organizational basis (hence likely to exist and be supported
well into the future). There isn't really any other encoding system
that would be adequate," says Julia Flanders.
Flanders, who met MLA's Judith Altreuter at a TEI training
seminar, was taken on as a consultant by the MLA. Flanders has 12
years of humanities computing experience on the Women Writers Project at Brown
University, and the consulting group she works with, Ridgeback, is
creating a specification and detailed documentation for NVS
encoding. It is also doing the actual tagging of the Winter's
Tale in association with Altreuter and the NVS editors.
While the encoding plan is still in prototype mode, Flanders
provided some samples that suggest the direction the Ridgeback group
is going. The tagging will capture several different kinds of
information, she notes:
- "Structural information about the text as a whole"
- "Details of bibliographic references"
- "Cross-references and other linking information"
- "Editorial apparatus"
- "Some renditional information (or rather, encoding that can be used to motivate
formatting: e.g. names that will be highlighted in the print version, foreign-language
words, etc.)"
This excerpt of a few lines from Winter's Tale (which
Flanders stresses is in a "preliminary state") includes the
beginning of a new act (2) and scene (1), a stage direction, and
dialogue. The prefix "tln" is an acronym for "'through line
numbering' which refers to the First Folio lineation and provides
the overall internal reference system for the play text and notes,"
Flanders says.
<div1 type="act" n="2">
<div2 type="scene" n="1">
<lb id="tln.583"/><head type="scene">Actus Secundus. Scena
Prima.</head><note type="asn">2.1</note>
<lb id="tln.584"/>Enter Hermione, Mamillius, Ladies: Leontes,
<lb id="tln.585" n="585"/><stage type="enter">Antigonus, Lords.</stage><?sgmlp
pgbrk pg="140"?>
<lb id="tln.586"/><sp who="Hermione"><speaker>Her.</speaker>
<p>Take the Boy to you: he so troubles me,
<lb id="tln.587"/>'Tis past enduring.</p></sp>
<lb id="tln.588"/><sp><speaker>Lady.</speaker>
<p>Come (my gracious Lord)
<lb id="tln.589"/>Shall I be your play-fellow?</p></sp>
<lb id="tln.590" n="590"/><sp who="Mamillus"><speaker>Mam.</speaker>
<p>No, Ile none of you. </p></sp>
...
</div2>
</div1>
The <?sgmlp pgbrk pg="140"?> processing instruction records the
actual page break in the print edition, so that the XML can be mapped to
PDFs of the print pages if needed at some time. The note type "asn"
stands for "act/scene number" and provides an alternate
notation. Flanders writes that "the 'n=' attribute seems in this
example to duplicate the tln number, but there are cases where the
two get out of whack, so this isn't as redundant as it looks
here."
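As a rough illustration of how a reader of the CD-ROM might work with these markers, the following Python sketch walks an abbreviated copy of the excerpt, lists each through-line number with its optional "n" value, and reports where page-break processing instructions fall. It is only a sketch under stated assumptions: the function names (walk, scan, mismatches) are hypothetical, the sample leaves out the elided lines and writes the closing tags as </div2> and </div1> so that it parses, and the real NVS encoding may well differ from this preliminary sample.

```python
import re
from xml.dom import minidom

# Abbreviated copy of the excerpt: elided lines are left out and the closing
# tags are written as </div2> and </div1> so that the fragment parses.
SAMPLE = """<div1 type="act" n="2">
<div2 type="scene" n="1">
<lb id="tln.585" n="585"/><stage type="enter">Antigonus, Lords.</stage><?sgmlp
pgbrk pg="140"?>
<lb id="tln.586"/><sp who="Hermione"><speaker>Her.</speaker>
<p>Take the Boy to you: he so troubles me,
<lb id="tln.587"/>'Tis past enduring.</p></sp>
<lb id="tln.590" n="590"/><sp who="Mamillus"><speaker>Mam.</speaker>
<p>No, Ile none of you. </p></sp>
</div2>
</div1>"""


def walk(node):
    """Yield a DOM node and all of its descendants in document order."""
    yield node
    for child in node.childNodes:
        yield from walk(child)


def scan(xml_text):
    """Collect (tln id, n) pairs and (page, preceding tln id) page breaks."""
    doc = minidom.parseString(xml_text)
    lines, breaks = [], []
    last_id = None
    for node in walk(doc.documentElement):
        if node.nodeType == node.ELEMENT_NODE and node.tagName == "lb":
            last_id = node.getAttribute("id")
            # getAttribute returns "" when the attribute is absent.
            lines.append((last_id, node.getAttribute("n") or None))
        elif (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
              and node.target == "sgmlp"):
            match = re.search(r'pg="(\d+)"', node.data)
            if match:
                breaks.append((match.group(1), last_id))
    return lines, breaks


def mismatches(lines):
    """Return ids whose explicit n value differs from the tln number."""
    return [lb_id for lb_id, n in lines
            if n is not None and n != lb_id.split(".", 1)[1]]


lines, breaks = scan(SAMPLE)
for lb_id, n in lines:
    print(lb_id, n)
for page, after_id in breaks:
    print("page break", page, "follows", after_id)
print("n/tln mismatches:", mismatches(lines))
```

The DOM parser is used here rather than a simpler tree API because it keeps processing instructions as ordinary nodes, which is what makes the page-break lookup possible.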
The sample textual commentary note that follows demonstrates how
quickly the NVS can become complicated. Here, the note discusses a
fragment of a single line but also cross references the
bibliographic entry for "Brook," creating a new chain of
connections:
<note id="cc.575" target="tln.511">
<p><app><lem>declare</lem></app> <name type="author">Abbott</name>
(§369):
<quote>The Subjunctive after verbs of command [<emph
rend="italic">coniure</emph> (509)]…is especially
common.</quote> See also <ref targType="bibl" target="b.bro76"><name
type="author">Brook</name> (1976, p. 107)</ref>. Cf. n.
1106.</p></note>
The "bibl" reference (a target from the above note) gives a glimpse
of the kind of entries that will populate the full bibliography of
the NVS XML Winter's Tale:
<bibl id="b.bro76"><author>Brook, G[eorge] L.</author>
<title level="m">The Language of
Shakespeare</title><imprint><date>1976</date></imprint></bibl>
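The chain from commentary note to bibliography entry can be followed mechanically. The Python sketch below is one hedged way to do it: it looks up each <ref targType="bibl"> target among the <bibl> ids. The <text> wrapper element, the shortened note body, and the names squash and resolve_bibl_refs are assumptions made for the example and are not part of the NVS encoding plan.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the note and bibl samples, wrapped in a hypothetical
# <text> element so that they form a single well-formed document.
DOC = """<text>
<note id="cc.575" target="tln.511">
<p><app><lem>declare</lem></app> <name type="author">Abbott</name>:
See also <ref targType="bibl" target="b.bro76"><name
type="author">Brook</name> (1976, p. 107)</ref>. Cf. n.
1106.</p></note>
<bibl id="b.bro76"><author>Brook, G[eorge] L.</author>
<title level="m">The Language of
Shakespeare</title><imprint><date>1976</date></imprint></bibl>
</text>"""


def squash(text):
    """Collapse line breaks and runs of spaces; treat None as empty."""
    return " ".join((text or "").split())


def resolve_bibl_refs(xml_text):
    """Pair each bibliographic <ref> in a note with its <bibl> entry."""
    root = ET.fromstring(xml_text)
    bibls = {b.get("id"): b for b in root.iter("bibl")}
    results = []
    for note in root.iter("note"):
        for ref in note.iter("ref"):
            if ref.get("targType") != "bibl":
                continue
            bibl = bibls.get(ref.get("target"))
            if bibl is None:
                continue  # dangling reference: no matching bibl id
            results.append({
                "note": note.get("id"),
                "line": note.get("target"),
                "author": squash(bibl.findtext("author")),
                "title": squash(bibl.findtext("title")),
                "date": squash(bibl.findtext("imprint/date")),
            })
    return results


for entry in resolve_bibl_refs(DOC):
    print(entry)
```

A front-end could use the same lookup to turn each citation into a link, or to list every commentary note that cites a given work.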
While the tagging is going to be solid in the edition that arrives
this fall, the MLA is still debating what, exactly, to package on
the CD-ROM. There will probably not be a front-end to read the XML
text, and some discussion has involved including PDF files linked to
the marked-up play. But Altreuter believes the important thing is
that they have entered the digital age. "We want to put this out
there and let people play with it," she says. "We had to start
somewhere."
Werstine is hoping that members of the humanities computing
community will step up to build on the efforts of the NVS
editors. "It is also our hope that someone will be sufficiently
interested when they get [the] XML version of the Winter's
Tale to see what can be done with it," he says. "We are
hoping that people will give their work to MLA."
The Future of the NVS
So what is down the road for the NVS editions? In the near term,
the MLA will be changing some of its basic processes. "At a minimum,
the XML will be used to generate the printed books," says Flanders,
"but in addition I expect that in the future it may serve as the
basis for some kind of electronic edition to accompany the print. In
such a case, we can imagine that we'd want to provide for various
kinds of searching and analysis, but the specifics remain to be
determined." One could also imagine, as Werstine does, a number of
the texts being made available in a single, searchable database, a
"docuverse". And Professor Braunmuller, the current chairman of the
MLA's NVS committee, is hoping that the NVS editors themselves will
do the markup as part of their preparation of the text.
But the biggest area for innovation is likely to come from the open-endedness
enabled by a digital text. "The moment that you publish [New Variorum Shakespeare
editions], they are out of date," says Altreuter, thinking of all the new scholarship
that hits the presses even a month after a volume is bound and shipped to bookstores.
Though she doesn't have a business model worked out, her personal vision imagines
that the texts could be part of a Website that allows ongoing expansion and
annotation -- a true community effort.
Of course, to make this leap would require grant money of some
kind, since open access to the texts on the Web would preclude the
self-sustaining support that comes from sales of the print
editions. It would also require a shift in the traditional notion of
authorship in an academic edition.
There are precedents for academic editions of Shakespeare made
available on the Web, though on a smaller scale. The Enfolded Hamlet,
for example, provides a simple interface for searching Bernice
Kliman's The Enfolded Hamlet text, which includes both
the Second Quarto and First Folio editions of the play. And the Web
is already home to numerous open-access communities for scholarly
discussion, though the need for editorial filtering still remains a
topic of debate.
For now, the NVS team members are excited to see the first step --
the move to XML -- finally being taken. What role the NVS
electronic editions will play in the academic debates of the coming
years, and how the editions will adapt to changing technology,
remains to be seen. "The scholarly community has to tell which
direction to go," says Altreuter. By building all future NVS
editions on the foundation of XML, though, the MLA has already
helped provide some direction itself.
For more information about the NVS project contact: David
G. Nicholls, Director of MLA Book Publications (