Building the Perfect Beast
Dreams of a Grand Unified Protocol
Rohit Khare // March 5, 1999
IEEE Internet Computing // Seventh Heaven
Seventh Heaven was inaugurated one year ago to chart the evolution of Internet application protocols. Perched upon the heights of the network stack, we paged back through history to trace the lineage of Telnet, FTP, SMTP, NNTP, Gopher, and HTTP. With Jon Postel's fingerprints to guide us, the arc of thirty years of Transfer Protocol (TP) development brought us to the present day -- where the pages of our history books go blank.
Leaving us, gentle reader, with but one option for our second season: to turn from documenting the past to divining the future. After all, unlike designs past, there are an infinite number of designs yet-to-be: XML-RPC, HTTP-NG, DIAMETER, IMPP, SIP, ISEN, SWAP, WAP, SDMI, and every other acronymious concoction we can get our hands on(*). And that's to say nothing of how protocols are actually adopted and history actually gets written: the mythical eighth (Economic) and ninth (Political) layers of the stack. These are the kinds of new topics we'll tackle in 1999: new protocols, economic models, and standardization processes.
(*)[Footnote: For the record, that's Remote Procedure Calls in Extensible Markup Language, HTTP-Next Generation, an extensible successor to the RADIUS (Remote Authentication Dial-In User Service) protocol, Instant Messaging and Presence Protocol, Session Initiation Protocol, Internet-Scale Event Notification, Simple Workflow Access Protocol, Wireless Application Protocol, and the Secure Digital Music Initiative.]
Conveniently, this raises a very timely topic to the head of our agenda: the IETF Applications Directorate's debate over the possibility and desirability of an Application Core Protocol (APPLCORE). This latest outbreak of an ancient meme bloomed rapidly, from its first mention in January to a Birds-of-a-Feather session at IETF-44 in Minneapolis by March. The proliferation of new applications derived -- or worse, cut-and-pasted by parts -- from SMTP and IMAP (Internet Message Access Protocol) has prompted an interest in documenting the common challenges of such extensions, and perhaps in developing an entirely new set of command building-blocks.
Another variant of the same infectious hope motivates the HTTP-Next Generation project, sponsored by the World Wide Web Consortium with participation from Xerox and Digital (profiled in detail by Bill Janssen last issue). From their vantage point, the Grail looks like a programmable distributed object protocol, atop an efficient new multiplexing transport. They look ahead to interoperating with Java RMI, CORBA IIOP, and RPC protocols.
And of course, HTTP/1.1 itself has proven flexible enough for a host of communities to (ab)use it to their own ends. From distributed authoring and versioning (WebDAV) to the Internet Printing Protocol (IPP), the freedom to define new methods, headers, and Internet Media Types at will has led to at least three schools of thought on extending the hypertext Web to other information transfer problems. Its proposed Mandatory extension mechanism aims to rationalize the patchwork of existing rules.
These three camps -- Mail, Web, and Objects -- are each vying to establish a new order for Application Layer Protocol design. The quest for universal applicability, though, crosses several common fault lines in the design space. Each camp tends towards different tradeoffs between human- and machine-readability, command-oriented and stateless transactions, end-to-end and proxiable connections, inspectability and security, and whether to exploit the unique properties of UDP, multicast, broadcast, and anycast transmission.
While it may seem quixotic to hope for a Grand Unified Protocol, part of the IETF ethos is that you never know unless you try. However slim the opportunity for clean reengineering, it still appears worthwhile to each camp to seek out an "application-layer TCP," a second neck in the Internet hourglass.
Or, as mail protocol and security maven Chris Newman put it when calling for APPLCORE in the first place, "What protocol facilities are common to most of FTP, HTTP, IMAP, LDAP, NNTP, POP, SMTP/ESMTP, Telnet and our other successful protocols? I can't predict what a careful study of IETF protocol history and comparison of candidate solutions will suggest." Since that's the very mission statement of our column, let's take a shot at evaluating these camps' prospects.
Diversity vs. the Melting Pot
While "transport" protocols merely deliver piles of bits, TPs need to 'tag-and-bag' their information, almost invariably using MIME syntax. IETF's eventual success in hammering out a standard for packaging information objects is a powerful inspiration for today's protocol unification.
It's not much of a stretch to think of today's TPs as merely different strategies for delivering MIME objects -- real-time vs. batch, push vs. pull, end-to-end vs. relayed, reliable vs. unreliable, and so on. Over the past year, this column has fleshed out a model of Transfer Protocols (TPs), as summarized in Table 1. We classify three aspects of TPs: addresses that identify participating nodes; distribution rules controlling transfers; and the message formats on the wire.
Before proposing a universal solution by picking winners and losers, it behooves us to investigate the tensions and tradeoffs that motivated such diverse solutions in the first place.
Engineering for Humans or Machines?
While all of these protocols are executed by machines, they are created by programmers -- two actors with rather different priorities for standardization. Will the messages be binary (and faster for computers to parse) or English text (and easier for humans to debug and extend)? Will the application semantics be precisely defined, a la RPC, or open to extensible interpretation, like HTTP?
Syntax. Two archetypal alternatives are RFC822 header fields and X Window System events. Email, news, and web headers from any information service are reasonably easy to understand, at the cost of marshaling complex data structures like dates and numbers as ASCII text and parsing all sorts of variations of case, line folding, comments, character sets and other gremlins to uphold the maxim "be liberal in what you accept". X Protocol messages are tightly packed binary data aligned on machine boundaries -- and can be processed several orders of magnitude more efficiently because the control flow through a server is clearly delineated by X Window System semantics and a strict extension mechanism.
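To make the contrast concrete, here is a minimal Python sketch of the two styles. It is an illustration only: the header block is invented, and the 12-byte record layout (timestamp, length, opcode, x) is a hypothetical stand-in rather than an actual X Protocol event.

```python
import struct
from email import message_from_string
from email.utils import parsedate_to_datetime

# Text style: an RFC 822-like header block with mixed case and a folded line.
TEXT_MESSAGE = (
    "DATE: Fri, 05 Mar 1999 12:00:00 +0000\r\n"
    "content-length: 42\r\n"
    "Subject: a folded\r\n"
    "  header line\r\n"
    "\r\n"
)

def parse_text(raw):
    """Parse the text form: case-insensitive names, unfolding, string-to-type conversion."""
    msg = message_from_string(raw)
    return {
        "year": parsedate_to_datetime(msg["date"]).year,   # date arrives as ASCII text
        "length": int(msg["Content-Length"]),              # number arrives as ASCII text
        "subject": " ".join(msg["subject"].split()),       # collapse the folded line
    }

# Binary style: a fixed, packed layout in network byte order.
# Hypothetical fields: uint32 timestamp, uint16 length, uint16 opcode, int32 x.
RECORD = struct.Struct("!IHHi")

def parse_binary(raw):
    """Parse the binary form: one unpack call, no variations to tolerate."""
    timestamp, length, opcode, x = RECORD.unpack(raw)
    return {"timestamp": timestamp, "length": length, "opcode": opcode, "x": x}

packed = RECORD.pack(920635200, 42, 7, -3)
print(parse_text(TEXT_MESSAGE))
print(len(packed), parse_binary(packed))
```

The text parser has to be liberal about case and folding before it can even begin converting strings to values; the binary parser does no such work, but nothing about its twelve bytes is legible without the layout in hand.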
Furthermore, both alternatives are relatively brittle. Internationalizing header text in various human languages is notoriously difficult because of the number of techniques available (RFC 2277, "IETF Policy on Character Sets and Languages"). IETF also has some (exasperating) experience with ISO's Abstract Syntax Notation (ASN.1) self-describing binary marshaling format for SNMP and X.500 public-key certificates. The tools for compiling such data structures are expensive and still can't guarantee interoperability amongst all the possible Encoding Rules (Basic, Distinguished, Text, ...).
The silver bullet of the moment is XML. Roll-your-own tagsets offer some hope of jointly guaranteeing machine-readable validity and human-readable message text. As WebDAV encountered, using XML in a wire protocol has its own costs: complex machinery for XML text encoding ("entities"), whitespace, namespaces, and bloated message text. And for all that, there are still no standard data-formatting rules for basic data types like dates, numbers, arrays, and so on.
The ability to simulate a protocol exchange with Telnet is another litmus test of whether it's catering to machines or humans. If the application itself expects to be supported by individual users and system administrators, it's useful to experiment and debug the system with a bare-minimum tool available on almost any host. If it's a core OS service supported by the vendor, then it's reasonable to expect libraries for inspecting binary messages and APIs to support it.
Semantics. Another way of justifying the "Telnet test" is to estimate how many independent implementations are expected. While there are only a few X servers and client libraries -- including definitive editions from the X Consortium -- there are uncountable little HTTP-driven scripts and hacked applications babbling away in pidgin dialects developers learned by imitating other clients and servers. The smaller the community of developers, the greater the likelihood of establishing a common ontology. A system as multilateral as the Web dilutes the semantics of "GET" to the point it could be opening a database, running a Turing-complete program, instructing a robot, or any other process that eventually generates a MIME entity.
Turn the argument on its head, and it illuminates the tension between APIs and protocols. If there is a suite of clearly defined operations, there's more ease-of-reuse by standardizing programming interfaces ("the Microsoft way," Generic Security Services (GSSAPI, RFC 1508)). If implementations are expected to diverge, just focus on the bytes-on-the-wire and the message sequence ("the IETF way," Simple Authentication and Security Layer (SASL, RFC 2222)).
It's the Latency, Stupid!
Performance constraints also vary with human-driven or machine-driven transactions. Interactive use must minimize latency, leading to stateless protocols, while batched server-to-server communications can optimize bandwidth utilization with stateful command protocols.
In the messaging arena, POP and IMAP provide client-server access, while SMTP and NNTP relay between stores. POP optimizes latency by selectively listing headers and bulk data separately; IMAP further offers concurrent operations. SMTP and NNTP, though, operate in modal lockstep, pipelining transmission and reception of new messages.
On the Web, last issue's column suggested that the earliest HTTP/0.9 spec was even less powerful than FTP -- but it had a critical advantage of lower latency. While FTP requires separate commands (and round-trips) to login, authenticate, navigate to a path, and request transmission, an HTTP download begins within one round-trip after connection establishment. That's because its request message encapsulates all those commands in a single message (URLs for path and filename; headers for authentication information and media-type) -- and each request can be processed on its own, without reference to the state of a connection.
As the 'bandwidth-delay product' gets larger and larger for wireless, fiber, and satellite links, stateless protocols become more valuable -- it takes less time to pickle the added state information and transmit it than to wait for a command to complete.
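A back-of-the-envelope sketch in Python shows why. The link figures below (a half-second round trip, 1 Mbit/s, a 10,000-byte object, 500 bytes of extra per-request state) are assumptions picked for illustration, not measurements; the four stateful round trips follow the FTP sequence described above (login, authenticate, navigate, request).

```python
def transfer_time(round_trips, rtt_s, payload_bytes, bandwidth_bps):
    """Seconds to finish: time spent waiting on round trips plus time to send the bytes."""
    return round_trips * rtt_s + (payload_bytes * 8) / bandwidth_bps

def bandwidth_delay_product(rtt_s, bandwidth_bps):
    """Bytes that can be 'in flight' on the link during one round trip."""
    return rtt_s * bandwidth_bps / 8

# Assumed satellite-like link and object size (illustrative values only).
RTT_S = 0.5
BANDWIDTH_BPS = 1_000_000
OBJECT_BYTES = 10_000
EXTRA_STATE_BYTES = 500   # state a stateless request must carry along every time

stateful = transfer_time(4, RTT_S, OBJECT_BYTES, BANDWIDTH_BPS)
stateless = transfer_time(1, RTT_S, OBJECT_BYTES + EXTRA_STATE_BYTES, BANDWIDTH_BPS)

print(f"stateful:  {stateful:.3f} s")
print(f"stateless: {stateless:.3f} s")
print(f"bandwidth-delay product: {bandwidth_delay_product(RTT_S, BANDWIDTH_BPS):.0f} bytes")
```

Under these assumptions the extra 500 bytes cost about four milliseconds on the wire, while each avoided round trip saves half a second; the larger the bandwidth-delay product, the more lopsided that trade becomes.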
End-to-end support is another contentious design decision facing our conquering heroes. Stateful command sequences are harder to proxy through firewalls or cache, but can manage concurrency explicitly. Store-and-forward messaging protocols (TPs) typically operate across a chain of relays, in contrast to interactive query and access protocols (APs) directly connecting clients and servers.
Consider, then, the Calendaring and Scheduling WG's dilemma. Their requirements included connected and disconnected operation, queries against multiple stores, and low-bandwidth operation (draft-ietf-calsch-capreq-02.txt). There are some arguments for modeling a calendar as a Web resource, and operations upon it as HTTP transactions. In particular, the DAV and DAV Searching and Locating (DASL) extensions cover some of their requirement space. It's unclear, though, if the HTTP caching model is sufficient for nomadic use. An access protocol like IMAP has richer support for reflecting local actions immediately, chains of actions, and conflict resolution, but at the expense of custom development (about four years in this case).
Concurrency is also represented differently in each school. An AP can tag its requests by ID and thus reshuffle its responses to several outstanding operations in the same session. APs like SMTP and NNTP can also "turn" the connection around to make requests in the opposite direction. Stateless TPs typically require synchronous response, tacitly pushing concurrency control -- the timing and priority of responses -- to the transport layer. Hence Web browsers that open multiple TCP connections, and the Message Multiplexing (MEMUX) effort within HTTP-NG.
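The tagging idea can be sketched in a few lines of Python. This is a toy model in the spirit of IMAP's tagged commands, not its actual grammar: the class name, the tag format, and the command strings are all hypothetical.

```python
import itertools

class TaggedSession:
    """Toy model of a session that matches out-of-order replies to requests by tag."""

    def __init__(self):
        # Hypothetical tag format: A001, A002, ...
        self._tags = (f"A{n:03d}" for n in itertools.count(1))
        self._pending = {}

    def send(self, command):
        """Label the command with a fresh tag and remember it as outstanding."""
        tag = next(self._tags)
        self._pending[tag] = command
        return f"{tag} {command}"

    def receive(self, line):
        """Pair a reply with its request; raises KeyError for an unknown tag."""
        tag, _, status = line.partition(" ")
        command = self._pending.pop(tag)
        return command, status

session = TaggedSession()
print(session.send("FETCH 1"))            # first outstanding request
print(session.send("SEARCH urgent"))      # second request, sent before any reply
print(session.receive("A002 OK done"))    # the server answers the second one first
print(session.receive("A001 OK done"))
```

Because each reply names the request it answers, the server is free to finish a quick search before a slow fetch on the same connection; a synchronous request-response protocol has no such label and must either answer in order or open another connection.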
Unlike end-to-end APs, proxying permits intermediaries to offer sophisticated services, from caching to Japanese translation. While HTTP cannot explicitly model the side-effects of a chain of operations (does the server reply to the outstanding GET or PUT first?), its statelessness does enable a rich caching model (I don't care as long as the reply's no older than five minutes).
Firewalls are another kind of proxy. Network administrators have a right to inspect the contents of Internet connections. Today's baseline is filtering services by TCP port number, but as more and more applications attempt to extend HTTP or other APs, the port number isn't precise enough to enable or disable specific services. It's intellectually dishonest to use an existing, popular protocol as mere transport "because it gets through firewalls" -- they're there for a reason.
For example, one totally flexible universal protocol is to map remote procedure calls to HTTP, whether in XML or by directly encoding, say, Distributed COM (shipped with NT5, though thankfully off by default). If Web traffic could now conceal information leaking out or hackers coming in...
Even multiplexing makes firewalls more resource-intensive. One TCP connection running MEMUX could have several subchannels, each of which must be judged individually.
End-to-end encryption of an AP is another tough scenario: while a TP proxy might be able to decrypt and then forward an entire information object, it can be a violation of the standard to interpose a proxy in the middle of a stateful conversation.
Security is more than transport-layer encryption alone. There are far too many application-layer Authentication and Authorization schemes. SASL succeeded by providing a simple building block -- a sequence of challenge-response messages and status codes -- that allows developers to mix-and-match authentication algorithms.
Making the Medium the Message
The layers below also offer application designers unique capabilities often overlooked. Telnet, for example, relied on TCP's urgent data and interrupt facilities, but HTTP serenely floats atop any 8-bit clean channel (even half-duplex!). Since there are so few applications that take advantage of broadcast, multicast, and anycast semantics, it's hard to plan ahead for a core protocol that could bridge those modes. And even though most services reserve TCP and UDP ports, datagrams are typically used for "small enough" messages -- it's the rare protocol that intelligently copes with lost packets.
Link-layers also affect the evolution of application protocols. The Wireless Application Protocol suite is founded on a belief that every Internet layer must be reinvented for the cellular environment. Very-low-bandwidth and very-high-latency environments call for compact message encoding and pipelining, among other features.
Camp the first: Mail
The initial call for APPLCORE came from folks in the email and directory communities. A strawman proposal like Application Core Protocol (draft-earhart-acp-spec) can trace its heritage back to Postel's SMTP and FTP state machines and his theory of reply codes. It defines a framework for transitioning to an authenticated connection, issuing commands, and receiving interleaved, tagged responses a la IMAP.
While it's easy to imagine, say, NNTP's current-group and current-article pointers and commands in this vein, the APPLCORE charter per se does not call for a stateful solution. It's just a coincidence that its author believes the proposed WG should "focus on a single 'core protocol' based on the connection-based stateful client-server structure that most successful IETF application protocols follow."
Camp the second: Web
HTTP proponents would beg to differ with Chris Newman's imprecation that it's one of the "protocol models with which we have far less experience research problems and thus out-of-scope." They can point at a handful of significant, standardized extension packages for HTTP/1.1 -- and a massive set of informal experiments. Its stateless, textual model is particularly hackable, even by shell scripts and programs without any internal model of the Web.
HTTP/1.1 enables all this by permitting new methods, header fields, and content types. WebDAV, for example, modified the PUT method with lock fields; added MOVE and COPY (limited by a Depth: header); and manipulates metadata with XML-coded requests and responses.
Complexity is the natural consequence of such freedom. Consider how byte-ranges, the ability to request only a portion of the expected response (typically, for partial rendering, e.g. a single page of a PDF file), interact with all other possible extensions.
This community's best solution is the Mandatory extension mechanism (draft-frystyk-http-mandatory), which provides a namespace to discover more about an extension, and switches to indicate whether to succeed or fail if an extension isn't recognized. This permits multiple extensions to interoperate by marking off their own method names, header fields, and error codes. Extensions are identified by URIs and marked as Mandatory or Optional obligations on the next server (proxy) or only the origin server.
The risk of continuing to subsume more application services within HTTP is the tendency to abuse it as an opaque container for their own traffic. One of its authors, Roy Fielding, declaimed: "The Web uses HTTP as a transfer protocol, not a transport protocol. HTTP includes application semantics and any application that conforms to those semantics while using HTTP is also using it as a transfer protocol. Those that don't are using it as a transport protocol, which is just a waste of bytes."
Camp the third: Objects
The third tack is to directly model Internet applications as distributed programs, and the client-server protocol follows as a consequence of the API. The IETF's standards-track Remote Procedure Call v2 specification (RFC 1831) was issued four years ago, based on fifteen years' experience at Sun and elsewhere. The popularity of CORBA and Java RMI interfaces underscores the power of this approach.
So while the Web camp could be said to provide OO interfaces through document transfer (FORMs and TABLEs and so on), the HTTP-NG effort aims to provide document transfer over an OO interface. Its three layers begin with MEMUX to optimize transmission for the Web by composing compression, encryption, and other services for its virtual channels; then a messaging layer that can marshal binary request and response messages; and its own upper layer for services like The Classic Web Application (TCWA), WebDAV, printing, and so on.
It's certainly appealing to reuse application protocols by subclassing and adding your unique functionality, then composing it with other off-the-shelf modules for security and performance. But Chris Newman invokes the conventional wisdom at the IETF: "RPC mechanisms are a poor choice in general for standards-based protocols. It's much harder to design an extensible and simple API than it is to design an extensible and simple wire protocol."
So with each camp set against the others, how likely is any compromise core protocol -- or one camp's hegemony? The answers are not at layer 7 alone: we must climb higher. Technical analysis alone does not point to a clear intersection of services, nor does it justify any protocol's universal utility.
Layer 8: Economics
First, investigate whose scarce resources are being reallocated. "Reuse" claims litter the 120+ message archive of the debate (http://lists.w3.org/Archives/Public/ietf-discuss/) -- but whose efforts are being reused?
There are at least three plausible actors: programmers, spec writers, and standards committees. The first group doesn't benefit until there's a single core protocol on the wire and a code library -- and then only for developing 'multi-protocol' tools. The second benefits by consulting the accumulated wisdom of APPLCORE to navigate the archives and cite prior art. The third benefits from rigorous modeling of the design alternatives and a framework to classify inter-extension dependencies in order to judge which protocols ought to be blessed.
Now we can identify institutions that would be motivated to invest in this effort. Look for vendors who implement lots of protocols within a single product. Microsoft, for example, promotes a single "Internet Information Server" with the vision of vending the same information by HTTP, DAV, and other protocols. Netscape has a new app for every protocol. Apache, focused purely on HTTP, has little interest in implementing printing, conference calls, or other whiz-bang applications. In the second group are authors with experience from multiple working groups. Such IETF luminaries are leading each camp: Chris Newman, Henrik Frystyk Nielsen, and Mike Spreitzer, respectively.
Layer 9: Politics
To identify the third class of actors, though, requires understanding motivations at the highest level: Politics. The standardization process is an institutional struggle, in this case to establish order within the IETF Directorates, and to defend its turf from ISO and other bodies.
APPLCORE intends to "go only in directions that at least two successful IETF protocols have gone," Newman claims. This extends the "rough consensus" maxim to bless reuse of solutions as guidance to spec developers. But to envision an APPLCORE robust enough to measure which protocols are worthy or not is another order entirely -- and one the IESG may not support.
For example, IESG member Brian Carpenter commented "that approach could imply that HTTPng is the basis for all future applications protocols I doubt that the IETF is ready to make that step." There are similar interests against defining the Internet as mail-like or web-like.
Instead, if no dominant camp emerges, a compromise which aims for their intersection raises the specter of another bogeyman: ISO. IETF culture alternately derides it for lowest-common-denominator designs with lots of options required for the useful cases; and for wasteful layering. Charges of ISOism have been leveled by all sides in this debate. Grand Unification Theories from the Object Management Group (OMG) don't go over well, either.
All of these considerations feed into the decision whether IETF even charters a WG on this topic. Without viable candidate protocols, it smacks of design-by-committee -- which triggers quite an allergic reaction in this community.
But if -- just suppose -- APPLCORE succeeded, what a wonderful world it would be! An application designer would only have to focus on understanding the underlying interaction pattern. Leave it to this bit of middleware to choose the right addressing system, syntax, and distribution algorithm in order to get the right bag of bits to the right people by the right time. Internet application design would seem more like "rational drug discovery," adapting a protocol tuned to its requirements rather than merely chasing popular implementations and patching them to match.
We expect new kinds of hybrid applications: Event notification with server-initiated delivery, not just polling. Smart pages that represent live processes. Multiprotocol message archives accessible over the web, on the phone, as a fax, by e-mail...
Sure, it would only be 'good enough,' but a single interface would be immensely valuable. TCP isn't ideally tuned for many applications, but its ubiquity has a quality all its own. Its twenty-year dominance has relied on technical ingenuity to manage its evolution and explicit political commitment. We have IP for packets, TCP for streams, and a clear vacancy for a message middleware standard.
The Perfect Beast
Realistically, though, we suspect this outbreak of the Perfect Beast meme will only take us as far as cataloging design patterns of Internet application layer protocols. Thirty years' experience is long enough to write a textbook, even if the ultimate solution isn't at hand. We can capture common solutions to common problems along with the rationale. Guidance like "measure compatibility with capability lists rather than version numbers" would greatly reduce design-time costs -- and guide repairs to today's protocols.
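As a small illustration of the capability-list guidance, consider the following Python sketch. The feature names and the version in which each supposedly appeared are hypothetical, chosen only to show where a version comparison guesses wrong.

```python
# Hypothetical table: the protocol version that introduced each feature.
FEATURE_SINCE = {"pipelining": 1, "compression": 2}

def supported_by_version(feature, server_version):
    """Infer support from a version number: assumes every server implements everything so far."""
    return server_version >= FEATURE_SINCE[feature]

def supported_by_capability(feature, server_capabilities):
    """Ask the server's advertised list directly: no inference needed."""
    return feature in server_capabilities

def negotiate(client_wants, server_capabilities):
    """Features both sides support, in the client's order of preference."""
    return [f for f in client_wants if f in server_capabilities]

# A server that calls itself "version 2" but never implemented pipelining.
server_version = 2
server_capabilities = {"compression"}

print(supported_by_version("pipelining", server_version))          # the version check guesses yes
print(supported_by_capability("pipelining", server_capabilities))  # the list says no
print(negotiate(["pipelining", "compression"], server_capabilities))
```

A version number asserts a bundle; a capability list lets partial, divergent, and extended implementations describe themselves honestly, which is closer to how Internet software is actually deployed.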
It's instructive to reflect upon other outbreaks. Mightn't the search for One True Application Protocol lead to the same cul-de-sac as the One True Programming Language? That community has also played a game of turtles: whether functional, imperative, or procedural ought to be the most fundamental representation. LISP has been the language of the future since 1959 -- even as it has grown to accommodate procedures (Scheme), objects (CLOS), and a vast array of time-tested tools (the Common Lisp environment). The risks of sprawling unification were incoherence, poor performance, and refragmentation into sublanguages.
There is a fundamental phenomenon at work: ontology recapitulates community. Ask how many people need to understand a given sentence, whether source code or protocol message. That's how universal the solution needs to be. If we want every Internet-accessible resource in a global hypermedia system, we can't expect more than a snapshot representation; if you expect to manipulate a datebook to automatically schedule a meeting, then only special-purpose calendaring tools will be cognizant of other conflicts.
All we can hope for is to make the common case easy and keep the hard stuff possible.
Rohit Khare is a graduate student in the WebSoft group at the University of California, Irvine; and a principal of 4K Associates, a standards strategy practice.
[Word count: ~4,000. Note to the editors: feel free to elide or substitute the intra-document headings. Note: Table 1 can be copied directly from the March/April 1998 issue's Table 1, with a suitable change of tense in the caption.]
[Table 1, surviving message-format entries: Text / Binary Files; 822 + MIME; 822 + MIME; 822 + MIME + HTTP caching]