Message-Oriented Middleware and the Software Engineer

Rohit Khare, University of California at Irvine, 25 October 1999


Software engineers coordinating distributed systems need a principled strategy to select messaging systems that reinforce desired system qualities. In this paper, I present a broad characterization of Internet-scale messaging systems, and a characterization of the 'ilities' that are promoted or inhibited by aspects of those messaging systems. These design aids will help make messaging technology more accessible to software designers and will encourage software engineers to reuse existing messaging systems rather than custom-built client-server protocols.


The Payroll 2000 Project is approaching a critical integration milestone. The Tax & Benefits team is putting the final touches on their multivariate, multijurisdictional pay-compression engine in Capital City. The Prettyprinting team has been running test data against their test printers all week in Podunk, fitting the United Way pledge box up against the bottom 1/32" of the form just so. Today is the Golden Spike: the former will crunch the test payroll and email the results to the prettyprinter and soon the lasers will be bonding magnetic ink to security-watermarked checks.

And... nothing.

Frantic debugging. The freshly compressed pay packets had gone out, all right. They were just stuck in the email queue of a puny departmental server straining under the load of incoming spam. Their critical business messages had gotten interleaved with junk mail.

Well, all right, let's try another messaging system: UDP (User Datagram Protocol). We'll go straight to the bare metal and there won't be any problem. Each pay packet will go across as a UDP packet to the printers. And lo and behold, they did much better than before. Now they could print 97% of the pay checks -- some fraction just vanished into the ether(net).

This is silly. The Web can solve all these problems, let's just fetch each paycheck from a Web server in Capital City. Now the printers will pull each stub, and we can be sure every last employee gets paid... slowly. The Web solution seems to take forever to open a new connection for each check, slowing down the printers. And for good measure, the office prankster has hooked up a convenient browser interface for any employee to spy on any other employee's take-home pay.

In truth, the Payroll 2000 Project is actually a pretty well-managed ship. The project was broken into subcomponents; development teams produced accurate software on time and on budget; and they're even using this leading edge Internet stuff to wire it all together. All their usual software engineering nostrums for building inside the black boxes worked fine.

Composing them, though, requires coordination through a messaging system. This floundering team didn't have an accurate understanding of the characteristics of its messaging systems nor how those characteristics impacted their desired 'ilities': efficiency, reliability, security, testability, and fault-tolerance.

Messages are a key abstraction for integrating commercial systems, especially at Internet-scale. Datagrams, streams, email, file transfers, Web pages -- how can software engineers select from a welter of competing transfer protocols (TPs)?

In this paper, I will first provide a broad characterization of several quite different Internet messaging systems -- UDP, TCP, FTP (files), SMTP (email), NNTP (news), and HTTP (web) -- according to their message formats, addressing, and distribution algorithms. Furthermore, I hope to distill some broader characterizations of which kinds of messaging middleware will support various 'ilities.'

Characterizing Internet Messaging Systems

Generically, a message-oriented middleware layer accepts an envelope of bits to be transferred to some (set of) endpoint(s). In my model, there is further variation along three axes: the format of the artifact (bits), the namespace of endpoints, and the transfer algorithm. In a reversal of the genre, I will detail my taxonomy before proceeding to describe some of the particular protocols.

The format of the actual artifact -- the so-called 'bits on the wire' -- is responsive to more abstract, application-layer intent. Is the message intended to invoke a single remote procedure call? a multimegabyte batch process? a video stream? Size is one obvious gross differentiator of protocols' capabilities. Lifetime -- how long the message is expected to remain visible -- is another vital statistic. For large, longer-lived messages, the format may explicitly support caching, through some sort of freshness indicator. There may also be structure information in the 'payload' itself: metadata regarding type, origin, routing traces, etc. Finally, I've been assuming asynchronous delivery of complete artifacts; if it's materializing incrementally or in real-time, there must be streamable formats, too.

The namespace of a network protocol's endpoints is usually determined by its layering in the network stack -- ethernet IDs at the link layer, IP addresses at the network layer, e-mail user ids at the application layer, and so on. Most of the transfer protocols considered here, though, are at the application layer, and thus admit a wide variety of namespaces. We shall concern ourselves with the syntax for endpoint names and addresses, with particular attention to their administrative controls. The namespace of artifacts within the transfer protocol supports yet another repertoire of operations. Some artifact addresses may be forwardable; that is, equivalent even when mirrored through another proxy or redirected to another artifact.

Finally, the actual transfer algorithms respond to broad topological constraints. Flood-fill netnews feeds, for example, work even amongst disconnected peers, while web page fetches require directly connected clients of a central server. Initiation is a key differentiator between those two: push (sender-initiated) vs. pull (receiver-initiated) protocols. We also characterize protocols by their intended reliability and latency.

The following table briefly outlines five Transfer Protocol (TP) choices for names and addresses; connection topology, delivery order, and initiation; and payload message formats.

Transfer Protocol | Names & Addresses             | Distribution                               | Message Format
Telnet            | host address + port number    | client-initiated, synchronous, end-to-end  | Bytestream w/interrupts
FTP               | host + file pathname          | client-initiated; separate data channel    | Text / Binary Files
SMTP              | user@host mailbox; Message-ID | sender-initiated push via relaying MTAs    | 822 + MIME
NNTP              | newsgroup names; Message-ID   | peer-to-peer flood-fill push               | 822 + MIME
HTTP              | URL Path                      | receiver-initiated pull; proxies & caches  | 822 + MIME + HTTP caching

Transport-Layer Protocols

I'll begin by comparing two transport-layer foundations for the transfer-layer protocols in the remainder of the survey.

UDP, jocularly known as the 'Unreliable Datagram Protocol,' is a thin wrapper around raw IP service. While it permits fragmentation and reassembly of datagrams up to 64K in size, the cumulative probability of packet loss practically limits transmissions to one Maximum Transmission Unit (MTU), typically 1500 bytes on an Ethernet segment. It is well suited to three scenarios: broadcasting or anycasting on a LAN, lossy isochronous media, and very large client/server ratios. Respective examples include bootstrapping queries for configuration data, RealAudio, and the Domain Name Service (DNS). In each case, UDP's defining advantage is its 'connectionless' algorithm: a shot in the dark doesn't require an initial setup with the recipient.
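UDP's 'shot in the dark' can be sketched in a few lines of Python; the loopback address and the payload here are invented for illustration:

```python
import socket

# Receiver: bind to a local port and wait for a single datagram.
recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.bind(("127.0.0.1", 0))          # let the OS pick a free port
recv.settimeout(5)
port = recv.getsockname()[1]

# Sender: no connection setup at all -- one sendto() is the whole exchange.
send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send.sendto(b"pay-packet #1", ("127.0.0.1", port))

data, addr = recv.recvfrom(1500)     # one MTU-sized buffer
print(data)                          # b'pay-packet #1'
send.close()
recv.close()
```

Note there is no acknowledgment anywhere: if the datagram is lost, the receiver simply blocks until its timeout expires.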

Its complement is TCP (Transmission Control Protocol), which establishes a reliable, full-duplex stream. (The real magic is in how it adapts transmission rates without explicitly signaling congestion in the network, but that machinery is not relevant to this analysis.) TCP's connection-oriented delivery algorithm incurs an initial delay for a 3-way handshake. Acknowledging each received segment also consumes bandwidth and state at each end (as a consequence, busy web servers typically hold thousands of spent TCP connections lingering in the TIME_WAIT state). Both transport protocols use IP addresses as the namespace for endpoints.
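The contrast with TCP can be sketched the same way: the connect() call below triggers the 3-way handshake, and the byte stream arrives reliably and in order until the close signals end-of-stream (addresses and payloads are again invented):

```python
import socket
import threading

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]
results = []

def serve():
    conn, _ = srv.accept()            # completes the 3-way handshake
    buf = b""
    while chunk := conn.recv(4096):   # reliable, ordered byte stream
        buf += chunk
    conn.close()
    results.append(buf)

t = threading.Thread(target=serve)
t.start()

cli = socket.create_connection(("127.0.0.1", port))  # SYN, SYN-ACK, ACK
for i in range(3):
    cli.sendall(b"segment %d " % i)
cli.close()                           # FIN marks end-of-stream
t.join()
srv.close()
print(results[0])                     # b'segment 0 segment 1 segment 2 '
```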

Transfer-Layer Protocols

Telnet is the least structured way to deliver messages over a TCP connection. Using the abstraction of a Network Virtual Terminal (NVT), it can connect two processes character-by-character or line-by-line. An online library catalog like MELVYL can be viewed as a remote database lookup protocol with synchronous query and result messages.

Aside from transmitting text in a variety of character sets, the 'format' of a Telnet transmission includes access to lower-layer URGent delivery. In order to effectively mask the latency of remote login, keystrokes like 'abort process' must be able to preempt normal streaming delivery. The actual size and shape of the entities transferred over it can be arbitrary, though, as the Gopher protocol demonstrated. In fact, SMTP and FTP are also built atop Telnet's NVT.

Telnet is the Swiss Army Knife of the Internet because its namespace for endpoints includes the port number along with the target host address. In conjunction with the IANA-registered list of well-known ports, it can be redirected to debug a wide variety of Internet services, from Finger to HTTP.

Its distribution algorithm, though, is the most naïve of the TPs: a client-initiated, end-to-end, synchronous channel, without any intermediate cacheability or proxiability. Its main strength is its lack of application-specific roles; if firewalls aren't in the way, Telnet can be used just as effectively to push information from central 'servers' as to pull it. It's the philosophy behind old-school UNIX tools such as talk.

FTP (File Transfer Protocol) was the next building block developed. Now the payload arrives in discrete chunks, complete with filenames and directory labels. It handles both text and binary ('image') files, the latter in a bewildering array of configurable byte widths and encodings. It even includes a restart facility for resuming interrupted transfers at a specified marker point in the content. It isn't as appropriate for 'live' streaming data transfers, though.

The Network Virtual File System (NVFS) had its own challenges bridging dramatically different filesystem designs in the early days. Even the concept of a 'directory', much less a 'parent directory', was up for grabs. Furthermore, once transferred, the files took on local names, shorn of any permanent metadata tracing them back to their original source. On the other hand, a manual redirection and caching infrastructure emerged through mirror servers, which transplanted one server's file store under a directory on another.

At the transport layer, though, FTP remained a client-initiated end-to-end protocol. The main twist was that each actual file transfer took place on a separate data channel; the control channel exchanged commands reserving a temporary port number for delivering the data. Among other reasons, TCP connection close could then be used to signal end-of-file. Since FTP could also upload files, though, it is our first formally recognizable message transfer protocol.
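The port-reservation step on the control channel is visible in FTP's '227' reply to a PASV command; a small parser, sketched here against an invented reply string, recovers the temporary data-channel address:

```python
import re

def parse_pasv(reply: str):
    """Extract the data-channel endpoint from an FTP '227' PASV reply.

    The control channel only reserves the rendezvous; the file itself
    travels over a second TCP connection to (host, p1 * 256 + p2).
    """
    nums = re.search(r"\((\d+,\d+,\d+,\d+,\d+,\d+)\)", reply)
    h1, h2, h3, h4, p1, p2 = map(int, nums.group(1).split(","))
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2

host, port = parse_pasv("227 Entering Passive Mode (192,168,1,2,19,137)")
print(host, port)   # 192.168.1.2 5001
```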

Email followed closely on its heels, in the guise of SMTP (Simple Mail Transfer Protocol). At this point, messages develop internal structure, dividing into headers and body (RFC 822, Standard for the Format of ARPA Internet Text Messages). Eventually, the richer MIME metadata (Multipurpose Internet Mail Extensions) would be used to indicate national character sets, multimedia content types, and security properties.

Message-ID was one of the original headers, establishing a new, globally unique namespace for every single email (and later, netnews) message. More to the point, SMTP ratified Ray Tomlinson's 1972 brainstorm for mailbox identification: user@host. This was all well and good when the ARPAnet was designed for 26 hosts, but once the new Domain Name System was in place, Mail Exchanger (MX) records were used to delegate all mail handling for an entire domain. In the process, SMTP also lost vestigial features such as 'SEND', which could immediately deliver a message to a logged-in user's terminal.

New protocols were required for the "last-mile", too. POP and IMAP are interactive access tools for reading mail messages. SMTP has been recast as the language of Mail Transfer Agents (MTAs), responsible for the assured queuing and handoff of messages across the Internet. By tracing the Received headers and message-IDs, MTAs can try to deliver a message for days, patiently ignoring temporary server failures and network congestion. The resulting distribution topology can push email from one-to-many, including mailbox forwarding.
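The relay trace those MTAs leave behind can be read back with Python's standard email parser; the hosts and Message-ID below are hypothetical:

```python
from email import message_from_string

raw = """\
Received: from mail.podunk.example by printer.podunk.example; 25 Oct 1999
Received: from smtp.capital.example by mail.podunk.example; 25 Oct 1999
Message-ID: <pay-run-42@capital.example>
From: payroll@capital.example
To: printer@podunk.example
Subject: compressed pay packets

...payload...
"""

msg = message_from_string(raw)
# Each MTA prepends its own Received header, so reading top-down walks
# the delivery path backwards, from final handoff toward the sender.
hops = [h.split(";")[0].strip() for h in msg.get_all("Received")]
print(hops[-1])   # from smtp.capital.example by mail.podunk.example
```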

A decade after that, in turn, the social structure of group email lists was reified as netnews, primarily in the form of NNTP (Network News Transfer Protocol). News messages are essentially identical to email messages, though MIME extensions are less common because of the low joint probability that enough readers will have MIME tools; multimedia has proven easier to initiate pairwise.

The Newsgroups header is the main difference, though. Even though the sender address remains in mailbox format, the destination is a (set of) hierarchically named newsgroups. Like any good Internet-scale namespace, USENET has survived by delegation to varying levels of official authority. The alt. prefix and .moderated and .d suffixes are examples of that.

NNTP's new distribution protocol is its raison d'être: a mesh of news spools connects peer-to-peer, each exchanging any message the other hasn't seen yet. This flood-fill algorithm is controlled by the Message-ID fingerprint and Expires deadline, as well as by 'semantic' filters on subscription (e.g. comp.*) and locale (e.g. ba, for the Bay Area). The effective latency ranges up to several days to propagate across the USENET.
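A toy simulation conveys the idea; the three-spool topology and the article ID are invented, but the accept-only-unseen-Message-IDs rule is the heart of the real algorithm:

```python
# Each spool offers new articles to its peers; a peer accepts only
# Message-IDs it has not yet seen, so the flood dies out once every
# spool holds a copy -- no central coordination required.
peers = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
spools = {name: set() for name in peers}

def post(origin, msg_id):
    frontier = [(origin, msg_id)]
    while frontier:
        node, mid = frontier.pop()
        if mid in spools[node]:
            continue                  # already seen: stop the flood here
        spools[node].add(mid)
        frontier.extend((peer, mid) for peer in peers[node])

post("a", "<pay-run-42@capital.example>")
print(all("<pay-run-42@capital.example>" in s for s in spools.values()))  # True
```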

An interesting footnote to NNTP's development is its use as a single-spool protocol. Much as earlier, more efficient instant messaging protocols like IRC (Internet Relay Chat) have been superseded by massive central systems (e.g. AOL Instant Messenger), many limited-interest newsgroups are never distributed to other spools at all. The microsoft.* groups, for example, are only available by directly connecting to Microsoft's server.

In the '90s yet another species of message transfer service emerged, HTTP (HyperText Transfer Protocol). In its modern form, it extended the MIME mail structure to deliver Web pages in a new HTML content-type. It also added richer cache-control information bounding the freshness and validity of a given HTTP response. Finally, it could accommodate streaming content delivery after a fashion, using TCP connection close or known-length chunking to indicate EOF.

The Web's central contribution, of course, was its namespace. Since every HTTP entity was assigned a URL in the Content-Location header, browser and server could navigate the same local namespace. Compound documents, in particular, relied on relative URL addressing to fetch their constituent parts. Furthermore, other parties (caching proxy servers) could deliver named entities, just as many USENET news spools could resolve the same Message-ID. Names could also be easily redirected into other hypertext links.

This speaks to the flexibility of the underlying protocol. While HTTP remains an end-to-end, one-to-one, synchronous request-response access protocol, it can be proxied, mirrored, cached, and gatewayed to a host of other information services. We also see hybrid usage patterns, with push email delivery of URLs and voluntary pull Web delivery of the actual data. The effective scope is a one-to-many global information store.
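The relative addressing that compound documents depend on is easy to demonstrate with Python's standard URL resolver; the URLs are invented:

```python
from urllib.parse import urljoin

base = "http://www.example.com/payroll/1999/index.html"
# A compound document names its parts relative to its own URL, so the
# same markup keeps working when the whole tree is mirrored elsewhere.
print(urljoin(base, "stub-0042.html"))
# http://www.example.com/payroll/1999/stub-0042.html
print(urljoin(base, "../logo.gif"))
# http://www.example.com/payroll/logo.gif
```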

Towards a richer taxonomy

With our brief introduction to five major Transfer-Layer protocols in mind, we can flesh out our model in greater detail. In particular, the major variable in message format is its handling of metadata, since most can be stretched to transfer any byte-string as payload. Our understanding of names and addresses is refined to map to the message and the service endpoints, respectively. Finally, we discuss some of the nuances of synchronization and initiation for distribution algorithms.

Meatdata & Metadata

Messages are where the rubber meets the road in TP design. First, most TPs are designed with a particular type of content in mind, and optimized to that end. Second, the elemental Protocol Data Units (PDUs) in these systems are messages that combine an envelope, the command and metadata about the entity, and a concrete representation of the entity itself.

Message content affects the choice of transport in TPs: login emulation with Telnet requires the URGent delivery flag in TCP segments; delivery of open-ended streaming data motivates a separate TCP connection for each HTTP/1.x operation; and time-critical multimedia content may use UDP datagrams directly.

The bytes on the wire of a message usually combine the commands of the distribution algorithm and the contents (FTP, though, separates its control and data channels). Messages combine the 'meatdata', a snapshot of the resource itself, with metadata. Typically, the source and destination addresses and transaction logs describe the command, while content-type, content-name, and content-lifetime information (for caching) describe the entity. Traditional Remote Procedure Call (RPC) and distributed-object protocols also exchange lightweight messages, but such caching and reflection support has been absent from them to date. Furthermore, advanced TPs also have distribution commands that interact directly with metadata, as with HTTP's content-negotiation and cache-validation GETs.
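The envelope/metadata/'meatdata' layering can be made concrete with Python's email library; the addresses, Message-ID, and payload are all hypothetical:

```python
from email.message import EmailMessage

msg = EmailMessage()
# Command metadata: who is sending, where it goes, and a globally
# unique name for this message.
msg["From"] = "payroll@capital.example"
msg["To"] = "printer@podunk.example"
msg["Message-ID"] = "<pay-run-42@capital.example>"
# Entity metadata: a lifetime bound an intermediary could use for caching.
msg["Expires"] = "Fri, 29 Oct 1999 00:00:00 GMT"
# The 'meatdata': a snapshot of the resource itself.
msg.set_content("employee,net\n0042,1234.56", subtype="csv")
print(msg.get_content_type())   # text/csv
```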

Names & Addresses

The first things a TP must define are its addresses for the nodes it transfers messages between, and the names for the entities it transfers. For many TPs, the endpoints are network interfaces on the connected Internet, so they use domain names or IP addresses directly. Sometimes, the endpoints are logical concepts, like individual users' mailboxes or globally-distributed newsgroups. The entities they purport to transfer have to be represented as messages, so their names identify the message at hand. Names are more intimately tied to the semantics of the service: FTP and HTTP use pathnames to look up resources, while SMTP and NNTP refer to messages by their globally-unique RFC-822 Message-IDs.

Sometimes there are additional relations between names and addresses. An HTTP caching proxy uses entity-names that include the original host address (e.g. http://cache/http://.../...). Similarly, an FTP mirror site can maintain a copy of the same file at a different host, possibly with a derived name (e.g. a prefix like /mirrors/sumex-aim/...). However, after fetching a file, its name is the local filesystem pathname; FTP does not use metadata to bind it to its "original" name. (MacOS, though, offers a neater solution: every file includes 'comment' metadata, so many tools put the original pathname there.)
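The proxy-naming convention mentioned above amounts to a one-line transformation; the cache hostname here is a placeholder:

```python
def cache_name(origin_url: str, cache_host: str = "cache") -> str:
    # Embedding the origin URL in the cached entity's name preserves
    # the binding to the source that a bare local pathname would lose.
    return f"http://{cache_host}/{origin_url}"

print(cache_name("http://www.capital.example/payroll/stub-0042.html"))
# http://cache/http://www.capital.example/payroll/stub-0042.html
```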

Push & Pull

Traditionally, various TPs' distribution rules are seen as their most salient classification. There are two levels of such description, though. At the mechanical level, most are built atop TCP, so it is natural for clients to initiate the process. However, we can speak of a more abstract intent in deployed applications: for senders to push data at their will, or for receivers to pull data at theirs. Traditional Web clients can only fetch data from servers; but FTP service usually allows up- and down-loading, so the net effect can be push or pull (even if it's not readily apparent in browser-based FTP access, which neglects uploading).

Another significant point about the distribution rules is their topology. Again, while TCP/IP is a point-to-point service, the net effect of message delivery can vary. Telnet is a strictly one-on-one service; but email can provide broadcast messaging with its intermediate relays and multiply-addressed envelopes. Similarly, though HTTP connections are one-to-one, the net effect of publishing a URL by mail or on television is also to broadcast the data. Multicasting message transfer remains a research issue, with only limited deployment of Scalable Reliable Multicast (SRM) techniques.

Finally, the choice of synchronicity determines whether and how TP services can be proxied by other parties. As opposed to simple tunneling, proxy service implies reprocessing of a message by an intermediate node. Synchronous conversations like HTTP can only be chained, while asynchronous handoffs can form an asymmetric loop, as in email routing. Synchronous one-to-many broadcasting can also cause 'ACK implosions', which explains why real-time broadcasting is designed to accommodate loss rather than process acknowledgments.

Mapping onto Software Engineering Properties

The hapless Payroll 2000 team still needs to choose an appropriate message-oriented middleware system. The old, centralized mainframe system used a single batch file. Today, they could use a distributed file system, email the records, pull them off a Web server, or write yet another custom message transfer protocol. In each case, there are different tradeoffs between performance, reliability, security, and a hundred other details.

Performance. The obvious factor determining the performance of message transfer would appear to be bandwidth, the sheer data rate. Most often, though, end-user perceptions of throughput are dominated by latency. Even at the transport layer, we can see that an average 4 kilobyte HTML page takes only about 20 milliseconds to transmit on a T1 (1.5 Mbps) line, but over 300 milliseconds to finally transit the public Internet.

At the transfer layer, though, we should speak of the latency of the entire message, not just a single link. The average email is just as small as a web page, but it could take days to deliver, while an HTTP server will provide an answer within seconds. On the other hand, the HTTP server could hand you a cached copy that could be months out of date (yes, even with valid caching parameters).

Those two kinds of latency are based on separate causes: distribution and freshness, respectively. Administrative routing of email and news flow to the next "upstream" server or "peer" means that even if one server goes down or only connects to the Internet sporadically, the message can be queued for later delivery, up to some maximum, typically five days for email. NNTP's flood-fill algorithm further blurs "delivery" since an article may take days to propagate around the world.

On the other hand, email and news messages are typically considered read-only; they are always the most-up-to-date content for that Message-ID. Looking up the pathname of a Web page or FTP file could have different results from moment to moment. In general, the freshness bounds of any intermediate cache or mirror -- even one built into a typical web browser -- can skew the results. So the latency between posting an updated version and client detection of that change can be open-ended.
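The freshness rule behind that open-ended update latency reduces to a simple comparison, sketched here in HTTP's max-age style:

```python
def is_fresh(age_seconds: float, max_age_seconds: float) -> bool:
    """A cache may answer from its stored copy while the entity's age
    stays within the lifetime the origin server declared."""
    return age_seconds < max_age_seconds

# A page cached ten minutes ago with a one-hour lifetime is served as-is;
# an update posted in the meantime stays invisible until expiry.
print(is_fresh(600, 3600))    # True
print(is_fresh(7200, 3600))   # False
```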

Instant Messaging is a hybrid form. Its presence notifications and chat fragments are often as complex as email messages, but feature end-to-end, virtually synchronous delivery. Overall latency bounds in this medium have also declined as industry economics have changed. Early systems like Internet Relay Chat (IRC) depended on a multi-hop distribution tree much like USENET news to minimize bandwidth, especially internationally. Now that chat users are a prized commercial audience, massive centralized systems from AOL, ICQ, Microsoft, and others use end-to-end protocols to avoid sharing any user data outside their domain.

Reliability. Since each step in mail transfer requires a reliable handoff -- the message must be saved to stable storage before returning "OK" -- its longer latency is seen as a small price to pay for robustness in the face of a flaky Internet. Conversely, isochronous multimedia content minimizes its delay budget by sending "raw" UDP packets and adapting to packet loss at a higher layer, e.g. adaptive hierarchical codecs with forward error correction for the most significant bits. Data loss over a reliable transport is also a source of failure. FTP includes a (now rarely used) restart mechanism in the face of client or server failure.

Reliable error notification is almost as critical as reliable delivery. Early HTTP implementations of PUT, for example, would react to unacceptable file uploads with an immediate error reply and shut down the connection. Since the client was still busily transmitting data, client TCP stacks would process the shutdown first, and the error reply would be lost. Similarly, email's lack of a standard message disposition notification (MDN) means that some delivery problems can go undetected, such as mail routing loops.

Maintainability. In practice, such a loop is traced from the Received: header each mail server appends to the message. As an English-language header stored in a queued text file, it can make it easier to deduce what went wrong than with 'more sophisticated' binary, database-driven message queues such as IBM's MQSeries or MSMQ. A similar oft-touted maintainability advantage for text-based Internet protocols is the possibility of "just telnetting" to the server port and manually exercising it -- to the point that modern SMTP servers still maintain online help systems!
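That "just telnet to the port" exercise can be reproduced with a raw TCP client; the toy daemon below stands in for a real SMTP server so the dialogue is self-contained (its greeting and replies are invented):

```python
import socket
import threading

# A toy line-oriented daemon standing in for an SMTP server.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

def daemon():
    conn, _ = srv.accept()
    conn.sendall(b"220 toy-smtp ready\r\n")
    if conn.recv(1024).upper().startswith(b"HELP"):
        conn.sendall(b"214 commands: HELO MAIL RCPT DATA QUIT\r\n")
    conn.close()

t = threading.Thread(target=daemon)
t.start()

# The 'telnet' side: a plain TCP client typing protocol text by hand.
cli = socket.create_connection(("127.0.0.1", port))
greeting = cli.recv(1024).decode().strip()
cli.sendall(b"HELP\r\n")
help_reply = cli.recv(1024).decode().strip()
cli.close()
t.join()
srv.close()
print(greeting)     # 220 toy-smtp ready
print(help_reply)   # 214 commands: HELO MAIL RCPT DATA QUIT
```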

Security. While many transfer protocols offer "security," the choice of protocol will still depend on the exact threat model. If the Payroll 2000 team wanted to fetch payroll records from a Personnel Web server, it could use end-to-end Transport Layer Security (TLS, a/k/a SSL3.1) to encrypt the transaction. Letting Personnel push payment orders to the Payroll 2000 team by email, though, increases reliability but exposes the 'cleartext' salary records to any intermediate mail servers, even with TLS protection over each hop. This illustrates the difference between security provided by the message transfer protocol and the message format itself. An email-based solution should use encrypted messages to avoid trusting intermediate administrators.

Of course, public-key based security of either sort requires some sort of infrastructure for publishing, validating, and revoking each party's keys. Viewing digital certificates as messages themselves illustrates that the same latency, reliability, and initiation choices for PKI protocols can affect the security of the overall system. If an employee of Personnel is sending a severance check, will it be printed before or after the arrival of the emailed key-revocation certificate firing that same embezzling clerk?

There are "accidental" security risks to be considered alongside these "essential" ones. Many contemporary developers are squeezing their applications into the HTTP toothpaste tube believing that this will "let them get through firewalls." Just because a TCP connection is destined to port 80 does not make it secure, nor even enforce that it's legal HTTP traffic. Just because installing port-blocking firewalls is a cheap and thus popular option for IT administrators does not mean it can't be replaced by a more powerful application-layer HTTP proxy if 'port 80 abuse' becomes widespread.

Scalability. The sheer number of messages transmitted is only a rough proxy for the scalability factors influencing the choice of protocol. A steady flow of data could be pushed by mail or pulled by FTP easily. HTTP can handle 'flash crowds' better, because of its lower per-transaction latency (GET takes one roundtrip rather than a whole SMTP or FTP 'conversation') and because it can redirect queries to replicas and caches. The topology of actual use is also influential: if information is destined for several people in the same domain, SMTP and NNTP only transmit it once, and local servers burst it out to local recipients/subscribers. Without caching, HTTP would transfer it N times. But while HTTP scales well to large repositories with infrequent, individual access, it cannot work well in disconnected environments. A nomadic user can more easily scale down bandwidth usage by batching up email transactions, rather than leaving a connection up for interactive browsing.

Implementations can also be more or less scalable. Typical Web and FTP servers can be replicated in a cluster, but only with the assistance of a behind-the-scenes distributed file system for consistency. While FTP sessions are long and can access the whole repository, each individual HTTP request can be redirected to a single server vending that portion of the repository. Email is even more 'embarrassingly parallel', since each node maintains its own queues. Of course, there are 'accidental' issues with respect to scalability as well: the design of HTTP forces the server to close its half of the connection first, leaving each TCP connection in a TIME_WAIT state for up to four minutes after a few milliseconds of use.

We can also speak of 'Internet-Scale' as a property beyond merely large numbers. These are economic, social, and political consequences of Inter-networking across organizations. Consider the sudden popularity of a nonprofit's web site: who bears the cost of the increased load? Here, scalability refers to the ability of other parties to replicate or cache that content. Today's chat systems use a central server and central namespace for identifying users, which may be adequate for up to O(10^8) humans, but swamped by O(10^10) devices. They also scale poorly politically: delivery of Pepsi's messages shouldn't be forced through a Coke-run server; competition can force decentralization.

Usability. It is easy to overlook the user interface implications of protocol choices. The Personnel department of a multinational corporation will require an internationalizable format for its names and addresses. Encoding human names into email addresses requires one set of ad-hoc character set selection rules, while escaping them as URLs requires another. It may be best to isolate them to the 'payload' of a message, so MIME headers can explicitly indicate charset parameters. Of course, transfer systems without MIME metadata will have consequential difficulty with multimedia. Current chat protocols are limited to lines of text, but new proposals for IMPP emphasize MIME-typed payloads in preparation for audio and video snippets.

Reusability. Often, the convenience of reusing an existing message transfer subsystem dominates developers' rationales for protocol selection. The central virtue of standardized protocols is that independently-written tools can interoperate, yet the emphasis on portability across a motley range of Internet hosts can lead to the dominance of a single code base. Consider the way Berkeley's BIND daemon determines "correct" behavior rather than the DNS specifications, or how SMTP came to be the sum of arbitrary choices in sendmail. The network effects of MIME, though, emphasize how standardization can effect separation of concerns. A MIME-savvy user interface component can be reused across a range of distribution protocols, from instant messaging to web pages.

Flexibility. The counterpoint to interoperability is flexibility, diversification within the standard. We have seen examples of adaptation in uses for all three aspects: accommodating multimedia messages, the use of mailbox addresses for devices like pagers and vending machines, and centralized netnews service. We can also speak of the flexibility of the system itself. On the Web, for example, the XML content type can mix and match tagsets from diverse ontologies within a single message; entirely new URI naming schemes can still be resolved by HTTP; and new methods and headers can mix in new transaction semantics, such as hit-metering or content-rating. Forward-compatibility with new message terminals is also desirable: content-negotiation for HTTP requests can make it easier to support new classes of devices, as well as alternate languages and file formats.


By now, we ought to be better equipped to discriminate among the 'ilities' that each of these Transfer-Layer protocol features promotes or inhibits. Ultimately, we can apply this understanding toward synthesizing new, more flexible protocols. This enables more principled software designs with clearer communication requirements, letting intelligent components sort out how and where to transfer their messages.

In other words, a single interface to message-oriented middleware, a hypothetical union I’d call *TP…


  1. TCP/IP Illustrated, Volume 3, W. Richard Stevens.
  2. TCP/IP Professional's Guide, Pete Loshin.
  3. Internetworking with TCP/IP, Douglas Comer.
  4. The Simple Book, Marshall T. Rose.
  5. Design Principles of the ARPANET, David D. Clark.
  6. Fundamentals of Software Engineering, Ghezzi, Jazayeri and Mandrioli.
  7. RFCs: Requests For Comments from the Internet Engineering Task Force [for FTP, SMTP, NNTP, HTTP, UDP, TCP] - this is the primary source material for this paper.
  8. Notes: How I Learned to Stop Worrying and Love HTTP by Rohit Khare and Adam Rifkin.
  9. Seventh Heaven, a column on application-layer protocol design in IEEE Internet Computing by Rohit Khare, replicated at