A Brief History of SOAP
by Don Box
April 04, 2001
It's been a little more than three years since I first started
working in XML in general and SOAP in particular. For the past year
or so, my own SOAP work has been pretty minimal, mainly because
without a stable XML Schema specification, the thought of building
tons of SOAP support plumbing seems pretty futile. Now that the XML
Schema WG has more or less completed its work, it's time to get back
to work (for me at least).
My first "official" act in this next phase of SOAP's development is
to take a few minutes to retrace the steps that got us here. Hence
this article.
In the Beginning: SOAP 98
When SOAP started in early 1998, there was no schema language or
type system for XML (in fact, XML 1.0 had just become a full
Recommendation that quarter). If you look at earlier versions of the
SOAP spec (including XML-RPC,
which was subsetted from the 1998 SOAP spec), most of the focus was on
defining a type system. The original type system of SOAP (and XML-RPC)
had a handful of primitive types, composites that are accessed by name
(a.k.a. structs) and composites accessed by position
(a.k.a. arrays). Once we had these representational types in place, we
modeled behavioral types by defining operations/methods in terms of
pairs of structs and, at least on the DevelopMentor and Microsoft
sides, aggregated these operations into interfaces. Hence the RPC
flavor that people associate with SOAP.
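To make the shape of that type system concrete, here is a minimal Python sketch. It is not the 1998 SOAP or XML-RPC wire format; the element names, the `item` tag, and the `encode` helper are all assumptions made for illustration. It shows only the three representational kinds (primitives, structs accessed by name, arrays accessed by position) and an operation modeled as a pair of structs.

```python
import xml.etree.ElementTree as ET


def encode(tag, value):
    """Encode a Python value as an XML element (illustrative format only)."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        # Struct: a composite whose members are accessed by name.
        for name, member in value.items():
            elem.append(encode(name, member))
    elif isinstance(value, (list, tuple)):
        # Array: a composite whose members are accessed by position.
        # The "item" tag is a hypothetical placeholder name.
        for member in value:
            elem.append(encode("item", member))
    elif isinstance(value, bool):
        elem.text = "true" if value else "false"
    else:
        # Remaining primitives (numbers, strings) use their text form.
        elem.text = str(value)
    return elem


# An operation is a pair of structs: one for the request, one for the response.
request = encode("Add", {"a": 2, "b": 3})
response = encode("AddResponse", {"result": 5})

print(ET.tostring(request, encoding="unicode"))
print(ET.tostring(response, encoding="unicode"))
```

Grouping several such request/response pairs under one name gives the interface notion described above.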
Was SOAP the first attempt to add a behavioral type system to XML?
Not at all. I recall scanning the landscape at the time. The existing
proposals either assumed a COM type system underneath (unacceptable,
since even back in 1998 we knew COM wasn't the ultimate type system)
or were very EDI-like, which would alienate parts of the development
community. For that reason, we looked at the existing serialization
formats (ASN.1 BER, NDR, XDR, CDR, JRMP) and RPC protocols (GIOP/IIOP,
DCE/DCOM, RMI, ONC) and tried to hit the sweet spot that would satisfy
the 80% case elegantly but could be bent to adapt to the remaining 20%
case.
So why didn't we ship SOAP back in 1998? That one's easy: Microsoft
politics.
The original contributors to SOAP within MS worked on the COM/MTS
team. At the same time, the XML group within MS was working on
XML-Data, which became one of the many seeds for the XML Schema
language we know today. As is often the case in large companies, the
two groups within MS didn't see eye to eye, so public support for SOAP
got shelved within MS for some period of time. (As a side note, I was
one of those people who didn't get XML-Data when I first encountered
it, and I have publicly apologized to Andrew Layman at least twice for
being so dense.)
Unwilling to wait out the slow process of getting MS to act on SOAP
beyond a press release, Dave Winer went out on his own and shipped the
XML-RPC specification based on
subsetting the original SOAP type system. I spent the rest of the year
working on Java metadata grunge, including among other things a
projection of Java class files onto XML.
SOAP Phase 2: 1999-2000
By the time a SOAP specification finally shipped using the name
"SOAP" (4Q1999), the W3C XML Schema language was by no means done, but
it certainly had progressed to a point where it became obvious to most
of the SOAP authors that we needed to leverage and integrate the work
of the Schema Working Group as much as possible. Their primitive types
were a superset of what we needed for SOAP. Their composite type
system was mostly a superset of what we needed for SOAP. It would
have been folly to ignore their work.
Ideally SOAP would have taken the representational type system of
XML Schemas verbatim and simply added the notion of behavioral types
and operations/methods. Unfortunately, XML Schemas lacked (and still
lacks) support for synthetic types such as typed references and
arrays. While you can define things that look like typed references
and arrays in the schema language, these constructs are not really
native to XML Schemas. Worse, you would need to predefine these
reference and array types, which makes it really difficult to
isomorphically move back and forth between say a Java class and an XML
Schema complex type. For that reason, SOAP needed to augment the type
system with the soap:reference and soap:Array
types. It is interesting to note that the Schemas Working Group tried
to tackle the typed reference issue; but, unfortunately, it couldn't
converge on a solution that would support typed references as they
appear in most programmatic type systems.
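The typed reference problem is easier to see with a small example. The Python sketch below is only an illustration, loosely in the spirit of the `id`/`href` style used for multi-reference values in SOAP section 5; the `encode_graph` helper, the `value` element name, and the `ref-N` identifiers are assumptions, not part of any specification. It serializes a struct in which two members point at the same object, so the shared value is written once and referenced twice.

```python
import xml.etree.ElementTree as ET


def encode_graph(root_tag, members):
    """Encode a struct whose dict-valued members may be shared by identity.

    Returns the root element plus a list of independently serialized
    targets, one per distinct shared object.
    """
    root = ET.Element(root_tag)
    ids = {}       # id(obj) -> reference id assigned on first sight
    targets = []   # one element per distinct referenced object
    for name, value in members.items():
        accessor = ET.SubElement(root, name)
        if isinstance(value, dict):
            key = id(value)
            if key not in ids:
                ids[key] = "ref-%d" % (len(ids) + 1)
                target = ET.Element("value", {"id": ids[key]})
                for member_name, member in value.items():
                    ET.SubElement(target, member_name).text = str(member)
                targets.append(target)
            # The accessor carries only a pointer to the shared target.
            accessor.set("href", "#" + ids[key])
        else:
            accessor.text = str(value)
    return root, targets


author = {"name": "Henry Ford"}
root, targets = encode_graph("Books", {"first": author, "second": author})

print(ET.tostring(root, encoding="unicode"))
for target in targets:
    print(ET.tostring(target, encoding="unicode"))
```

Note that nothing in this output says what type `first` and `second` refer to; a typed reference would carry that constraint in the schema, which is the part that had to be layered on by SOAP.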
Most of what the 4Q1999 SOAP specifications did was simply
illustrate how to model typed references and arrays in the W3C XML
Schema type system. Period. We also had a model for adding optional
and mandatory protocol headers (like CORBA's service contexts and
DCOM's ORPCTHIS/THAT), but that was it. Frankly, had the schema
specification been a full REC in 4Q1999, the SOAP specification would
have been at most 3-4 pages. However, the XML Schema specification was
changing radically with each successive Working Draft, so those of us
working on SOAP had to deliberately insulate ourselves from the churn
that was W3C XML Schema during 1999 and 2000 in order to make any
progress whatsoever.
To me, the biggest technical issue that faced SOAP in 1999 and 2000
was the lack of metadata. DevelopMentor tried to introduce a simple
metadata format (CDL) that was isomorphic with the XML Schema type
system, yet didn't tie us to the rather fluid schema language. Dave Winer
totally balked at the idea of metadata, indicating that
human-readable descriptions were all that was needed. Certainly folks
like Eric Raymond seem
to agree with him. The reason we abandoned CDL, however, was a
discussion with Gopal Kakivaya of Microsoft, who convinced us that
what we needed could be achieved by annotating XML Schemas with
additional SOAP-specific hints that were allowed (and in fact
anticipated) by the Schema specification. At this point, DevelopMentor
joined the Schemas WG and most of our effort internally moved towards
XML Schema support. We use XML Schema a lot around DM, having shipped
an XSD compiler for C++ that we use internally.
The biggest non-technical issue that faced SOAP in 1999 and 2000
was the hideous nature of vendor wars. The FUD that flew around the
trade press and vendor web sites was downright embarrassing. I
recently ran across a Sun Reality
Check that made me ill. In particular, the following quote blew me
away:
SOAP has changed a lot. It started to become interesting to us
when IBM made additions to the mediocre specification that Microsoft
initially championed (you're right, we thought that specification was
a bad idea).
If SOAP/1.0 (the last pre-IBM version) was a bad idea, then so was
SOAP/1.1 (the first post-IBM version, which was submitted to the
W3C). There were no major improvements to SOAP from 1.0 to 1.1. The
specification was reorganized to make the modular design of SOAP more
apparent. However, the few minor technical changes we made were
arguably a step backward (in fact, I believe to date there are no SOAP
implementations that do anything meaningful with the one new feature,
SOAPActor).
The Post-SOAP Era: 2001 and beyond
So where are we now? Tough question. Here are some observations
about the current state of play.
The XML Schema specification is stable and now a
Proposed Recommendation.
To me, this is the most important
advance for people who care about XML protocols and messaging in
general and SOAP in particular. The fact that
no major changes can be made before advancing to a full W3C
recommendation means that the industry at large knows what they are
dealing with when it comes to applying types to XML. I have stated
before, and I still stand by my belief, that without XML Schemas, XML
is a balkanized standard and its utility for software, component, or
service integration is fairly minor. The Schema specification does
most of the heavy lifting for SOAP, and it kills me that we can't do a
SOAP/1.2 to address the new schema language. Which brings me to my
next observation.
The W3C now has an XML Protocol Working
Group.
SOAP is now where it belongs. Until we got W3C buy-in, vendors were
skittish given the nature of the industry. Now that SOAP has been
subsumed into the XML Protocol work, the big vendors have (for the
most part) stopped arguing about SOAP and we have a fairly open
process for beating the protocol into shape. In my opinion, one of the
smartest things the WG did was to immediately define their
relationship to the XML Schema type system.
We are somewhat closer to having a standardized
metadata format for SOAP.
While far from perfect, WSDL is
as close as we've ever come to having a workable metadata standard
that more than three people can agree on for longer than a week at a
time. Is WSDL perfect? Not by a long shot. Is it workable? For the
most part, yes. Does SOAP/XML Messaging make sense without something
like WSDL? No way. My own criticisms of WSDL relate to
WSDL's current form having a somewhat schizophrenic relationship to
XML Schema. (In fact, there are several ways in which WSDL and XML
Schema are completely incompatible.) Despite my criticisms, portions
of WSDL are more than workable, albeit overly verbose and
indirect, for every SOAP scenario or application I have dealt with in
the past 3 years. Hopefully the XML Protocol Activity will focus on
finishing the WSDL specification and give the world a reasonable
way of describing, validating, and automating XML-based services.
For the most part, people have stopped arguing about SOAP.
SOAP is what most people would consider a moderate success. The
ideas of SOAP have been embraced by pretty much everyone at this
point. The vendors are starting to support SOAP to one degree or
another. There are even (unconfirmed) reports of interoperable
implementations, but frankly, without interoperable metadata, I am not
convinced wire-level interoperability is all that important. It looks
like almost everyone will support WSDL until the W3C comes down with
something better, so perhaps by the end of 3Q2001 we'll start to see
really meaningful interoperability.
Epilogue
SOAP's original intent was fairly modest: to codify how to send
transient XML documents to trigger operations or responses on
remote hosts. Because of our timing, we were forced to tackle issues
that the Schemas WG has since solved, which caused the "S" in SOAP to
be somewhat lost.
At this point in time, I firmly believe that only two things are
needed for mid- to long-term convergence:
- The XML Schemas WG should address the issue of typed references and
arrays. Adding support for these two synthetic types would obviate the
need for SOAP section 5. These constructs are broadly useful outside the
scope of messaging and RPC applications, so it makes sense that the Schemas WG
should address this.
- Define the handful of additional constructs needed to tie the
representational types from XML Schemas into operations and WSDL-style
portTypes.
WSDL comes close enough to providing the necessary behavioral
constructs to XML Schemas, and I am cautiously optimistic that
something close to WSDL could subsume SOAP entirely. I
strongly encourage you to study the WSDL specification and
submit comments, improvements, and errata so we can get convergence and
interoperability in our lifetime.