Thursday, November 12, 2009

Development of the Domain Name System


P. Mockapetris, K. Dunlap, "Development of the Domain Name System," ACM SIGCOMM Conference, 1988.


One-line summary: This paper describes the design and deployment of the Domain Name System and reflects on its surprises, successes, and shortcomings some years after its initial introduction.

Summary

This paper discusses the design and development of the Domain Name System (DNS), candidly reflecting on its history and future. DNS is the Internet's naming service; its primary use is mapping hostnames to IP addresses, though it stores other kinds of data as well. Prior to the development and adoption of DNS, mappings of hostnames to IP addresses were disseminated through the HOSTS.TXT file, which was centrally maintained on a server at SRI and distributed to machines on the Internet via file transfers. As the number of hosts on the Internet grew, it became clear that this approach would not scale, and DNS was developed to replace it. The design goals of the DNS were (1) it must provide the same information as the old HOSTS.TXT system, (2) it must be able to be maintained in a distributed manner, (3) it must have no obvious size limits for names and associated data, (4) it must interoperate across the Internet, and (5) it must provide reasonable performance. In light of these constraints, it had to be extensible and avoid forcing a single OS, architecture, or organization onto its users. Initially, the design of DNS tried to strike a balance between being very lean and being completely general.

DNS is a hierarchical system. The main components of this system are name servers and resolvers. Name servers contain mappings and answer queries. Resolvers provide the client interface, including algorithms to find a name server to query for the information desired by the client. The DNS name space is a tree in which each node has an associated label; the domain name of a node is the concatenation of all the labels on the path from that node to the root of the tree. Labels are case-insensitive and the zero-length label is reserved for the root. DNS decouples the tree structure from implicit semantics in order to provide applications with more choices. The data associated with each name in the DNS is stored as resource records (RRs), each of which has a type field and a class field along with its data. Some of the more common types include A records that map hostnames to IP addresses, PTR records that provide a reverse map from IP addresses to hostnames, and NS records that map zones to name servers. A zone is a contiguous section or subtree of the namespace that is controlled by a specific organization; for example, UC Berkeley controls berkeley.edu. The controlling organization is responsible for delegating subzones, maintaining the data of that zone, and providing redundant name servers for that zone. In addition to distributing data through these zones, DNS name servers and resolvers also cache RRs, which are removed from the cache after their TTL expires.
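The naming and caching rules above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not how any real name server or resolver is implemented: the names `domain_name`, `ResourceRecord`, and `RRCache` are hypothetical, the cache holds a single record per (name, type, class) key rather than a full record set, and the address used in the usage note comes from a documentation-only range.

```python
import time
from dataclasses import dataclass


def domain_name(labels):
    """Build a fully qualified name from labels ordered node-to-root.

    The root's zero-length label is left out of `labels`; it shows up
    as the trailing dot. Labels are case-insensitive, so lowercase them.
    """
    return ".".join(label.lower() for label in labels) + "."


@dataclass(frozen=True)
class ResourceRecord:
    name: str    # owner name, e.g. "www.berkeley.edu."
    rtype: str   # record type, e.g. "A", "PTR", "NS"
    rclass: str  # record class, e.g. "IN"
    ttl: int     # lifetime in seconds once cached
    data: str    # type-specific payload, e.g. an IP address


class RRCache:
    """Hypothetical cache that drops a record once its TTL has passed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (record, expiry time)

    def put(self, rr):
        key = (rr.name.lower(), rr.rtype, rr.rclass)
        self._entries[key] = (rr, self._clock() + rr.ttl)

    def get(self, name, rtype, rclass="IN"):
        key = (name.lower(), rtype, rclass)
        entry = self._entries.get(key)
        if entry is None:
            return None
        rr, expires_at = entry
        if self._clock() >= expires_at:
            # TTL expired: remove the record and report a miss.
            del self._entries[key]
            return None
        return rr
```

For example, `domain_name(["www", "berkeley", "edu"])` yields `"www.berkeley.edu."`, and a record put into the cache with a TTL of 60 is returned by `get` until 60 seconds have elapsed on the cache's clock, after which `get` returns `None`.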

In addition to describing the basics of DNS, the paper describes its early implementation status and its first deployment in the Berkeley network. This was apparently painful but necessary. The paper then goes on to describe a number of surprising issues in the operation of the DNS. The first was that the assumption that the form and content of the information stored by the DNS were well understood turned out to be a poor one. The second relates to performance. The performance of the underlying network was initially much worse than the designers of DNS anticipated, and as a result DNS query performance was poor. They also found it difficult to make reasonable measurements of DNS performance due to the interference of unrelated effects. The third surprise was the high number of negative responses to queries, leading to the need for caching such responses, which is referred to as negative caching. There were a number of very successful aspects of DNS as well. These include the choice of a variable-depth hierarchy; the organization-oriented structure of names; the use of UDP to access name servers; the feature that lets a name server include additional data in a response if it sees fit, allowing it to anticipate some requests; the use of caching; and the agreement to use organizationally structured domain names for mail addressing and routing. Lastly, shortcomings of the DNS include difficulty in defining new classes and types of RRs, difficulty in upgrading applications to use DNS, and difficulty in conveying to system administrators the correct way of configuring and using DNS. The paper concludes by offering several pieces of advice and describing some future directions for DNS.
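Negative caching, mentioned above as a response to the third surprise, is easy to picture in code. The following is a minimal sketch under assumptions of my own, not the mechanism the paper specifies: `NegativeCachingResolver`, its `lookup` callable, the `negative_ttl` value, and the `upstream_queries` counter are all hypothetical, and a `None` result stands in for a "no such name" response.

```python
import time


class NegativeCachingResolver:
    """Hypothetical resolver wrapper that remembers failed lookups."""

    def __init__(self, lookup, negative_ttl=300, clock=time.monotonic):
        self._lookup = lookup              # callable: name -> answer or None
        self._negative_ttl = negative_ttl  # seconds to remember a failure
        self._clock = clock                # injectable for testing
        self._negative = {}                # name -> expiry time
        self.upstream_queries = 0          # how often `lookup` was called

    def resolve(self, name):
        key = name.lower()  # names are case-insensitive
        expires_at = self._negative.get(key)
        if expires_at is not None:
            if self._clock() < expires_at:
                # Still within the negative TTL: answer "no such name"
                # without asking the upstream server again.
                return None
            del self._negative[key]
        self.upstream_queries += 1
        answer = self._lookup(key)
        if answer is None:
            # Remember the failure so repeated queries stay local.
            self._negative[key] = self._clock() + self._negative_ttl
        return answer
```

With this wrapper, repeated queries for a name that does not exist reach the upstream lookup only once per `negative_ttl` interval, which is the saving that motivates caching negative responses in the first place.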

Critique

I really liked this paper; I usually enjoy papers that provide historical perspective and reflection on well-known systems, like this one and the “design of the DARPA Internet protocols” paper that we read at the beginning of the semester. I especially thought that many of the lessons learned and the resulting advice in this paper were just good system-building advice, plain and simple. Some particular gems I liked include, “Documentation should always be written with the assumption that only the examples are read,” “It is often more difficult to remove functions from systems than it is to get a new function added,” “The most capable implementors [sic] lose interest once a new system delivers the level of performance they expect; they are not easily motivated to optimize their use of others’ resources or provide easily used guidelines for administrators that use the systems,” and “Allowing variations in the implementation structure used to provide service is a great idea; allowing variation in the provided service causes problems.”
