Second Generation Web Services
by Paul Prescod
February 06, 2002
In the early days of the Internet, it was common for enlightened
businesses to connect to it using SMTP, NNTP, and FTP clients and
servers to deliver messages, text files, executables, and source
code. The Internet became a more fundamental tool when businesses
started to integrate their corporate information (both public and
private) into the emerging Web framework. The Internet became popular
when it shifted from a focus on transactional protocols to a focus on
data objects and the links between them.
The technologies that characterized the early Web framework were
HTML/GIF/JPEG, HTTP, and URIs. This combination of standardized
formats, a single application protocol, and a single universal
namespace was incredibly powerful. Using these technologies,
corporations integrated their diverse online publishing systems into
something much more compelling than any one of them could have
built.
Once organizations converged on common formats, the HTTP protocol,
and a single addressing scheme, the Web became more than a set of Web
sites. It became the world's most diverse and powerful information
system. Organizations built links between their information and other
people's. Amazing third-party applications also wove the
information together; examples include Google, Yahoo, Babelfish, and
Robin Cover's XML citations.
First generation web services are like first generation Internet
connections. They are not integrated with each other and are not
designed so that third parties can easily integrate them in a uniform
way. I think that the next generation will be more like the integrated
Web that arose for online publishing and human-computer
interactions. In fact, I believe that second generation web services
will actually build much more heavily on the architecture that made
the Web work, using the holy trinity: standardized formats (XML
vocabularies), a standardized application protocol, and a single URI
namespace.
This next generation of web services will likely adhere to an
architectural style called REST, the underlying architectural model of
the current Web. It stands for "representational state transfer". Roy
Fielding of eBuilt created the name in his PhD
dissertation. Recently, Mark Baker of Planetfred has been a
leading advocate of this architecture.
REST explains why the Web has URIs, HTTP, HTML, JavaScript, and
many other features. It has many aspects and I would not claim to
understand it in detail. In this article, I'm going to focus on the
aspects that are most interesting to XML users and developers.
The Current Generation
SOAP was originally intended to be a cross-Internet form of DCOM or
CORBA. The name of an early SOAP-like technology was "WebBroker" --
Web-based object broker. It made perfect sense to model an
inter-application protocol on DCOM, CORBA, RMI etc. because they were
the current models for solving inter-application interoperability
problems.
These technologies achieved only limited success before they were
adapted for the Web. Some believe that the problem was that Microsoft
and the OMG supporters could not get along. I disagree. There is a
deeper issue. RPC models are great for closed-world problems. A closed
world problem is one where you know all of the users, you can share a
data model with them, and you can all communicate directly as to your
needs. Evolution is comparatively easy in such an environment: you
just tell everybody that the RPC API is going to change on such and
such a date and perhaps you have some changeover period to avoid
downtime. When you want to integrate a new system you do so by
building a point-to-point integration.
On the other hand, when your user base is too large to communicate
coherently you need a different strategy. You need a pre-arranged
framework that allows for evolution on both the client and server
sides. You need to depend less on a shared, global understanding of
the rights and responsibilities of a participant. You need to put in
hooks where compliant clients and servers can innovate without
contacting you. You need to leave in explicit mechanisms for
interoperating with systems that do not have the same API. RPC
protocols are usually poorly suited for this kind of
evolution. Changing interfaces tends to be extremely
difficult. Integrating services typically takes complicated software
"glue".
I believe this is the reason no enterprise has ever successfully
unified all of its systems with DCOM, CORBA, or RMI.
Now we come to the crux of the problem: SOAP RPC is DCOM for the
Internet.
There are many problems that can be solved with an RPC
methodology. But I believe that the biggest, hairiest problems will
require a model that allows for independent evolution of clients,
servers, and intermediaries. It is important, then, for us to study
the only distributed applications to ever scale to the size of the
Internet.
The Archetypal Scalable Application
The two most massively scalable, radically interoperable,
distributed applications in the world today are the Web and
email. What makes these two so scalable and interoperable? They depend
on standardized, extensible message formats (HTML and MIME). They
depend on standardized, extensible application protocols (HTTP and
SMTP). But I believe that the most important thing is that each has a
standardized, extensible, global addressing scheme.
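One way to picture what "extensible" buys you in a message format is a client that simply skips elements it does not recognize, so a server can add fields without breaking it. The following is only a minimal sketch under assumptions: the `quote` vocabulary, the field names, and the `read_quote` helper are invented for illustration and do not come from any real service.

```python
from xml.etree import ElementTree as ET

# Fields this (older) client understands. Hypothetical vocabulary.
KNOWN_FIELDS = ("symbol", "price")

def read_quote(xml_text):
    """Extract the known fields and skip any elements added later."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root if child.tag in KNOWN_FIELDS}

# The same client reads both the original message and an extended one.
v1 = "<quote><symbol>IBM</symbol><price>107.5</price></quote>"
v2 = ("<quote><symbol>IBM</symbol><price>107.5</price>"
      "<currency>USD</currency></quote>")
old_view = read_quote(v1)
new_view = read_quote(v2)
```

The client sees the same two fields in both cases, which is the kind of independent evolution that a rigid, positional RPC signature makes hard.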
There's an old real estate joke that the only three things which
make a property valuable are location, location, and location. The
same is true in the world of XML web services. Properly implemented,
XML web services allow you to assign addresses to data objects so that
they may be located for sharing or modification.
In particular, the Web's central concept is a single unifying
namespace of URIs. URIs allow the dense web of links that make the Web
worth using. They bind the Web into a single mega-application.

URIs identify resources. Resources are conceptual
objects. Representations of them are delivered across the Web in HTTP
messages. These ideas are so simple and yet they are profoundly
powerful and demonstrably successful. URIs are extremely loosely
coupled. You can even pass a URI from one "system" to another using a
piece of paper and OCR. URIs are late bound. They do not declare what
can or should be done with the information they reference. It is
because they are so radically "loose" and "late" that they scale to
the level of the Web.
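The distinction between a resource and its representations can be sketched in a few lines. This is a toy model under stated assumptions, not how any real server works: the `RESOURCES` table, the `example.com` URI, the order data, and the `represent` function are all hypothetical, and a real system would deliver the representation in an HTTP response rather than return a string.

```python
from xml.etree import ElementTree as ET

# Hypothetical in-memory "web": each URI names one conceptual resource.
RESOURCES = {
    "http://example.com/orders/42": {"id": "42", "status": "shipped"},
}

def represent(uri, media_type):
    """Return one representation of the resource identified by `uri`."""
    resource = RESOURCES[uri]  # the URI identifies; it does not say what to do
    if media_type == "text/plain":
        return "order {id}: {status}".format(**resource)
    if media_type == "application/xml":
        root = ET.Element("order")
        for key, value in sorted(resource.items()):
            ET.SubElement(root, key).text = value
        return ET.tostring(root, encoding="unicode")
    raise ValueError("no representation for " + media_type)

# The URI is just a string: it could arrive by email, or on a piece of paper.
uri = "http://example.com/orders/42"
as_text = represent(uri, "text/plain")
as_xml = represent(uri, "application/xml")
```

Note that the caller decides what to do with the URI only at the moment of use. Nothing in the string itself commits either side to a format or an operation, which is what "late bound" means here.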
Unfortunately, most of us do not think of web services in these
terms. Rather we think of them in terms of remote procedure calls
between endpoints that represent software components. That's CORBA
and DCOM thinking. Web thinking means organizing around URIs for
resources.
Claim: The next generation of web services will use
individual data objects as endpoints. Software component boundaries
will be invisible and irrelevant.
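To make the claim concrete, here is a rough sketch of what "data objects as endpoints" might look like: every object has its own URI, and one small, uniform set of methods applies to all of them. The `store` dictionary, the `/stocks/IBM` path, and the `handle` function are assumptions made up for this illustration. The status codes follow HTTP conventions, but nothing here touches the network.

```python
# Hypothetical store: each data object lives at its own URI.
store = {}

def handle(method, uri, body=None):
    """Apply a uniform GET/PUT/DELETE interface to the object at `uri`."""
    if method == "PUT":
        created = uri not in store
        store[uri] = body
        return (201 if created else 200), None
    if method == "GET":
        if uri in store:
            return 200, store[uri]
        return 404, None
    if method == "DELETE":
        if uri in store:
            del store[uri]
            return 204, None
        return 404, None
    return 405, None  # method not part of the uniform interface

# The client never learns which software component owns the object.
put_result = handle("PUT", "/stocks/IBM", "<quote><price>107.5</price></quote>")
get_result = handle("GET", "/stocks/IBM")
```

Contrast this with an RPC design, where the client would call something like a `getStockQuote` procedure on a named component and would need a new procedure for every new kind of object.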