Monday, August 31, 2009

The Design Philosophy of The DARPA Internet Protocols


D. D. Clark, "The Design Philosophy of the DARPA Internet Protocols," ACM SIGCOMM Conference (August 1988).

One line summary: This paper examines the underlying goals and logic that motivated the design of the DARPA Internet Architecture.

Summary

The primary goal of the DARPA Internet Architecture was to develop an effective technique for the multiplexed utilization of existing interconnected networks. The multiplexing technique selected was packet switching, and it was assumed that the networks would be joined by packet switches called gateways. The secondary goals were sevenfold and included, in order of importance, survivability in the face of partial failures, support for multiple types of communications services, accommodation of a variety of networks, capacity for distributed management, cost effectiveness, easy host attachment, and accountability.

These goals directly influenced the resulting design. For instance, as a consequence of the first goal, the Internet architects chose as protection against failure an approach called fate-sharing, one ramification of which is that intermediate nodes retain no essential state about the connections passing through them. As a consequence of the second goal, support for a variety of services, the designers tried to make TCP as general as possible, but found it too difficult to build the wide range of needed services into one protocol and concluded that more than one transport service would be necessary. As a consequence of the third goal, that the Internet should operate over a wide variety of networks, the architecture had to make very few assumptions about the underlying capabilities of the network. Here the author uses a sort of end-to-end argument to explain why certain services were engineered at the transport level.
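
To make the fate-sharing idea more concrete, here is a minimal Python sketch of my own (not from the paper): the gateway forwards datagrams using nothing but a routing table, while all per-connection state, such as sequence numbers and unacknowledged data, lives at the endpoints, so that state can only be lost if the endpoint itself is lost.

```python
# Illustration only: a fate-sharing style split of state between
# endpoints and gateways; names and structure are my own, not Clark's.

class Gateway:
    """Forwards datagrams using only a routing table; it keeps no
    per-connection state, so losing a gateway loses no connection state."""
    def __init__(self, routes):
        self.routes = routes  # destination address -> next hop object

    def forward(self, datagram):
        self.routes[datagram["dst"]].receive(datagram)


class Endpoint:
    """Holds all the state for its own connections (sequence numbers,
    unacknowledged data); if the host fails, its connections fail with it."""
    def __init__(self, addr):
        self.addr = addr
        self.connections = {}  # peer address -> per-connection state

    def send(self, gateway, dst, payload):
        conn = self.connections.setdefault(dst, {"next_seq": 0, "unacked": []})
        datagram = {"src": self.addr, "dst": dst,
                    "seq": conn["next_seq"], "data": payload}
        conn["next_seq"] += 1
        conn["unacked"].append(datagram)  # the endpoint remembers; the gateway will not
        gateway.forward(datagram)

    def receive(self, datagram):
        print(f"{self.addr} received seq {datagram['seq']}: {datagram['data']}")


a, b = Endpoint("A"), Endpoint("B")
gw = Gateway({"A": a, "B": b})
a.send(gw, "B", "hello")  # prints: B received seq 0: hello
```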

The author notes that of the seven secondary goals, the first three discussed above had the largest impact on the design of the Internet, while the last four were less effectively met. This, too, has had a large impact on the state of the Internet today. For example, the author explains that some of the biggest problems with the Internet today stem from insufficient capacity for distributed management.

Critique

This paper is important because it provides insight into the beginnings and evolution of the Internet. Given the Internet's complexity, such historical perspective is useful for addressing its existing problems and for designing new protocols and applications. Since hindsight is 20/20, there are a number of now-obvious criticisms that could be made of the designers of the Internet. Several are made in the paper itself, pertaining to the failure to fully achieve the last four secondary goals and to the difficulty of providing guidance to implementers, especially with respect to performance. It is to the author's credit that he points out these shortcomings.

I would imagine that some of these criticisms, such as the one related to distributed management, have been addressed since the paper was written, while others remain an issue. One major oversight on the part of the designers of the Internet, as is often pointed out, was their failure to consider the possibility of malicious users on the network, how such users affect the security and stability of the Internet as a whole, and what provisions should be taken to counteract them. This is somewhat ironic since, as the paper says, “the network was designed to operate in a military context.” Today, so-called cybersecurity is an area of intense concern to the military, and perhaps if the original designers had had some foresight into this issue, it would be less of a problem.

As a final note, I found it interesting that in the conclusion the author speaks of “the next generation of architecture” and potential alternative building blocks to the datagram. We still rely on a version of TCP/IP today, albeit an improved one. It has proven very challenging, if not entirely infeasible, to change the underlying architecture of the Internet at this point; the push to switch to IPv6 is one example of the difficulties involved. Perhaps this is another failing of the designers. I'd be interested to know more about what the author had in mind here.

End-to-End Arguments in System Design


J. H. Saltzer, D. P. Reed, D. D. Clark, "End-to-End Arguments in System Design," 2nd International Conference on Distributed Computing Systems, Paris (April 1981), pp. 509-512.


One line summary: This paper presents the “end-to-end argument,” a design principle for the placement of functions in a distributed system, which argues that in many cases a function should be implemented upward, closer to the application level, to avoid redundancy and added cost.


Summary

The end-to-end argument with respect to communication systems can be summarized as follows: most functions can only be “completely and correctly” implemented with the knowledge of the end-point applications, so providing such functions at the lower layers, below the application, is not sufficient. Such functions will usually have to be re-implemented in some form at the application level, resulting in redundancy and potentially decreased performance. The authors provide several example functions in communication networks to which the end-to-end argument can be applied, including end-to-end reliability, delivery acknowledgment, secure data transmission, duplicate message suppression, and FIFO message delivery. They also provide both contemporary and historical instances in which the end-to-end argument has been used outside of communication networks, including in transaction management, in support of RISC computer architecture, and in OS “kernelization” projects.
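
The canonical illustration in the paper is careful file transfer: however reliable each link and the transport layer may be, only a check performed by the application over the stored file confirms that the transfer as a whole succeeded. Below is a minimal Python sketch of such an end-to-end check; the function names and the retry policy are my own, not the paper's.

```python
# Illustration only: an application-level end-to-end integrity check in the
# spirit of the paper's "careful file transfer" example. Helper names and
# retry policy are my own.
import hashlib

def sha256_of(path, chunk_size=1 << 16):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def careful_transfer(copy_fn, src_path, dst_path, max_attempts=3):
    """Run copy_fn (any transfer mechanism, reliable or not), then verify the
    result end to end by comparing checksums of the source and the stored copy."""
    expected = sha256_of(src_path)
    for attempt in range(1, max_attempts + 1):
        copy_fn(src_path, dst_path)          # lower layers may run their own checks...
        if sha256_of(dst_path) == expected:  # ...but only this check covers the whole path
            return True
        print(f"attempt {attempt}: checksum mismatch, retrying")
    return False

# e.g. import shutil; careful_transfer(shutil.copyfile, "data.bin", "backup.bin")
```

Whatever reliability the lower layers add can only reduce how often the retry loop fires; the end-to-end check is what actually establishes correctness, which is exactly the redundancy the argument warns about.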

The authors are careful to point out that the end-to-end argument is not an absolute rule that can be applied in all situations. To illustrate, they give the example of a computer communication network that carries voice packets. In the first scenario, the voice packets are carried between two users on telephones in real-time conversation. If, in this case, reliability is implemented at the lower layers via the usual means of retransmissions and acknowledgments, it will likely introduce unacceptable delays into the application, rendering conversation difficult. Instead, the application would do better to accept damaged or missing packets and rely on higher-level mechanisms to achieve the needed reliability, such as one user asking the other to repeat something that was unintelligible. In this scenario the end-to-end argument clearly applies. In the second scenario, voice packets are transmitted from the sender to storage for later listening, as in voicemail. In this case delays are no longer disruptive, and in fact reliability and message correctness gain importance, since the listener can’t ask the speaker to repeat something. Thus, in this case, low-level implementation of reliability would be acceptable, and the end-to-end argument does not necessarily hold.
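
To make the contrast concrete, here is a rough Python sketch of the two receiver policies (my own illustration, with invented names): a real-time receiver plays whatever arrives on schedule and conceals gaps, while a voicemail receiver tolerates delay and insists on completeness before storing the message.

```python
# Illustration only: two receiver policies for voice packets, loosely matching
# the paper's real-time conversation and voicemail scenarios.

SILENCE = b"\x00" * 160  # placeholder frame used to conceal a lost packet

def realtime_play(packets, play):
    """Low delay matters most: play each frame on schedule and paper over
    losses; the people on the call provide end-to-end recovery by asking
    each other to repeat anything unintelligible."""
    for seq, frame in packets:  # frame is None if the packet was lost
        play(frame if frame is not None else SILENCE)

def voicemail_store(packets, request_retransmit):
    """Completeness matters most: delay is acceptable, so keep asking for
    missing frames, because the listener cannot ask the speaker to repeat."""
    stored = {}
    for seq, frame in packets:
        stored[seq] = frame if frame is not None else request_retransmit(seq)
    return b"".join(stored[s] for s in sorted(stored))
```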

Critique

The important contribution of this paper is its identification and articulation of a central theme that has guided the designers of many varieties of computer systems to this day. Layered architecture design can be seen as an extension of this approach, although the end-to-end argument is arguably more general than that. As the authors themselves point out, this principle has provided important guidance and support not only in networks, but also in operating systems and hardware design, among other areas.

One question that came to my mind as I read the paper is: what is the counter-argument to the end-to-end argument? Are there certain functions or situations for which implementation of the function at the low levels, or redundantly at a number of levels, is called for? I haven’t thought of any particular examples yet, but I’ve been pondering.