Thursday, November 12, 2009

DNS Performance and the Effectiveness of Caching


J. Jung, E. Sit, H. Balakrishnan, "DNS Performance and the Effectiveness of Caching," IEEE/ACM Transactions on Networking, Vol. 10, No. 5, October 2002.


One line summary: This paper studies the Domain Name System with respect to client-perceived performance and the effectiveness of caching by analyzing several sets of real trace data including DNS and TCP traffic.

Summary

This paper examines DNS performance from the perspective of clients by analyzing DNS and related TCP traffic traces collected at MIT and the Korea Advanced Institute of Science and Technology (KAIST). It seeks to answer two questions: (1) what performance, in terms of latency and failures, clients actually perceive, and (2) how varying the TTL and the degree of cache sharing affects the effectiveness of caching. The paper first provides an overview of DNS and then explains the methodology. The study uses three sets of traces containing outgoing DNS queries and responses along with outgoing TCP connections, and collects various statistics for analysis, including, for a given query, the number of referrals involved, the lookup latency, and whether the query succeeded. The dominant query type the authors observed was A, at approximately 60%, followed by PTR at approximately 30%, with MX and ANY making up roughly equal shares of the remainder. Only 50% of the queries were associated with TCP connections, and 20% of the TCP connections were not preceded by DNS queries.
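
To make the methodology concrete, here is a minimal sketch of timing individual lookups by query type, in the spirit of the paper's client-side latency measurements. This is not the authors' tooling: it assumes the third-party dnspython library, and the names queried are arbitrary examples.

    import time
    import dns.exception
    import dns.resolver  # third-party dnspython package (assumed available)

    def timed_lookup(name, rdtype="A"):
        # Issue one DNS query and return (latency in ms, answer or None).
        start = time.monotonic()
        try:
            answer = dns.resolver.resolve(name, rdtype)
        except dns.exception.DNSException:
            answer = None  # NXDOMAIN, timeout, etc.; the paper separates these cases
        latency_ms = (time.monotonic() - start) * 1000.0
        return latency_ms, answer

    # A and PTR were the dominant query types in the traces (~60% and ~30%).
    for name, rdtype in [("example.com", "A"), ("8.8.8.8.in-addr.arpa", "PTR")]:
        ms, answer = timed_lookup(name, rdtype)
        count = len(answer) if answer is not None else 0
        print(rdtype, "lookup of", name, "took", round(ms, 1), "ms,", count, "records")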

The rest of the paper is organized around the two main questions it attempts to answer. First, the authors examine client-perceived performance. They find median lookup latencies of 85 and 97 ms across the traces. 80% of lookups are resolved without any referrals, and each additional referral increases the overall lookup latency. About 20-24% of lookups receive no response at all, yet of the answered lookups, 90% involve no retransmissions, and unanswered lookups generate far more wide-area traffic than answered ones. Together, these three facts suggest that many DNS name servers are too persistent in their retry strategies. The authors also find that most negative responses are due to typos and to reverse lookups for addresses that have no host name, and that negative caching does not work as well as it should, probably because the distribution of names causing negative responses is heavy-tailed. Finally, 15-18% of lookups contact root or gTLD servers, and 15-27% of those lookups receive negative responses.

Next, the authors examine the effectiveness of caching. Because the distribution of domain name popularity is very long-tailed, many names are accessed just once, so a low TTL costs little: caching does not help much for those names anyway. However, not caching NS records would greatly increase load on the root and gTLD servers, so TTL values for NS records should not be set too low. For the same reason, sharing a cache among clients does not increase the hit rate much beyond a certain point: a few names are very popular, but the rest are likely to be of interest to just one host. They also find that increasing the TTL beyond a few minutes does not appreciably increase the cache hit rate, because most cache hits come from a single client looking up the same name several times in a row. See the sketch below for a miniature version of this experiment.
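
The TTL finding is easy to reproduce in miniature. The sketch below is a hypothetical, simplified version of the kind of trace-driven cache simulation the paper performs: it replays a list of (timestamp, name) lookups through a TTL-bounded cache and reports the hit rate; the toy trace is invented for illustration.

    def cache_hit_rate(trace, ttl):
        # Replay (timestamp, name) lookups through a TTL-bounded cache.
        expires = {}  # name -> time at which the cached answer goes stale
        hits = 0
        for t, name in trace:
            if t < expires.get(name, 0):
                hits += 1                 # answer still fresh: cache hit
            else:
                expires[name] = t + ttl   # miss: fetch and cache the answer
        return hits / len(trace)

    # Toy trace in seconds; a real one would come from captured DNS traffic.
    trace = [(0, "a.com"), (5, "a.com"), (40, "b.com"), (300, "a.com")]
    for ttl in (10, 60, 600):
        print("TTL", ttl, "->", cache_hit_rate(trace, ttl))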

Critique

I always really like papers that examine systems like DNS in this way, because it's interesting that even though we built these systems, we don't always understand them and can't predict the kinds of observations that studies like this later make. As always, it would be interesting to go back and repeat such a study now that about a decade has passed and see what turns up. One thing in this paper that confuses me is how the authors could have found that such a significant fraction of lookups never receive an answer. I don't think they provide much in the way of an explanation for this, if I remember correctly. Are these unanswered lookups just due to dropped packets? That seems like a lot of dropped packets, and I find that surprising, although I have no real basis for thinking so.

Development of the Domain Name System


P. Mockapetris, K. Dunlap, "Development of the Domain Name System," ACM SIGCOMM Conference, 1988.


One line summary: This paper describes the design and deployment of the Domain Name System and reflects on its surprises, successes and shortcomings from the perspective of some years after its initial introduction.

Summary

This paper discusses the design and development of the Domain Name System (DNS), candidly reflecting on its history and future. DNS provides the naming service for the Internet, primarily mapping hostnames to IP addresses, among other things. Prior to the development and adoption of DNS, the method for disseminating mappings of hostnames to IP addresses was the HOSTS.TXT file, centrally maintained on a server at SRI and distributed to machines on the Internet via file transfers. As the number of hosts on the Internet grew, it became clear that this approach did not scale, so another was needed; the one developed was DNS. The design goals of DNS were: (1) it must provide the same information as the old HOSTS.TXT system, (2) it must allow maintenance in a distributed manner, (3) it must have no obvious size limits for names and associated data, (4) it must interoperate across the Internet, and (5) it must provide reasonable performance. In light of these constraints, it had to be extensible and avoid forcing a single OS, architecture, or organization onto its users. Initially, the design of DNS tried to strike a balance between being very lean and being completely general.

DNS is a hierarchical system. The main components of this system are name servers and resolvers. Name servers store mappings and answer queries; resolvers provide the client interface, including the algorithms needed to find a name server to query for the information a client wants. The DNS name space is a tree in which each node has a label; a node's domain name is the concatenation of all the labels on the path from that node to the root. Labels are case-insensitive, and the zero-length label is reserved for the root. DNS deliberately decouples the tree structure from any implicit semantics in order to give applications more choices. The data associated with each name is stored in resource records (RRs), each with a type and class field along with the data itself. Some of the more common types include A records, which map hostnames to IP addresses; PTR records, which provide the reverse map from IP addresses to hostnames; and NS records, which map zones to name servers. A zone is a contiguous subtree of the name space controlled by a specific organization; for example, UC Berkeley controls berkeley.edu. The controlling organization is responsible for delegating subzones, maintaining the zone's data, and providing redundant name servers for it. In addition to distributing data through these zones, DNS name servers and resolvers cache RRs, which are removed from the cache once their TTL expires. The sketch below illustrates how a resolver walks this tree.
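
Here is a minimal sketch of an iterative resolver that starts at a root server and follows NS referrals downward, in the way the paper describes resolvers locating the right name server. It is an illustration rather than code from the paper: it assumes the third-party dnspython library, hard-codes one root server address, and for brevity only follows referrals that arrive with glue A records.

    import dns.message
    import dns.query
    import dns.rdatatype  # all from the third-party dnspython package (assumed)

    def iterative_resolve(name, server="198.41.0.4", max_referrals=10):
        # Walk the DNS tree from a root server (a.root-servers.net here),
        # following NS referrals until an answer section comes back.
        for _ in range(max_referrals):
            query = dns.message.make_query(name, dns.rdatatype.A)
            response = dns.query.udp(query, server, timeout=3)
            if response.answer:
                return response.answer  # reached a server that has the data
            # A referral: the authority section names the next zone's servers,
            # and the additional section carries their glue A records.
            glue = [rdata for rrset in response.additional
                    if rrset.rdtype == dns.rdatatype.A
                    for rdata in rrset]
            if not glue:
                return None  # a full resolver would now look up the NS names too
            server = glue[0].address
        return None

    print(iterative_resolve("www.berkeley.edu"))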

In addition to describing the basics of DNS, the paper describes its early implementation status and its first deployment on the Berkeley network, which was apparently painful but necessary. The paper then goes on to describe a number of surprises in the operation of DNS. The first was that the designers' assumptions about the form and content of the information stored in the DNS turned out to be poor ones. A second relates to performance: the performance of the underlying network was initially much worse than the designers of DNS had anticipated, so DNS query performance was poor, and they also found it difficult to make reasonable measurements of DNS performance because of interference from unrelated effects. The third surprise was the high proportion of negative responses to queries, which led to the need to cache such responses, referred to as negative caching (sketched below). There were a number of very successful aspects of DNS as well. These include the choice of a variable-depth hierarchy, the organization-oriented structure of names, the use of UDP to access name servers, the feature that allows a name server to include additional data in a response if it sees fit, letting it anticipate follow-up requests, the use of caching, and the agreement to use organizationally structured domain names for mail addressing and routing. Lastly, the shortcomings of DNS include the difficulty of defining new classes and types of RRs, the difficulty of upgrading applications to use DNS, and the difficulty of conveying to system administrators the correct way of configuring and using DNS. The paper concludes by offering several pieces of advice and describing some future directions for DNS.
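
Negative caching is simple enough to sketch. The class below is a hypothetical illustration of the idea the paper motivates: remember names that recently produced a "no such name" response so that repeated lookups of bad names can be answered locally for a bounded time. The names and TTL used here are invented.

    import time

    class NegativeCache:
        # Remembers names that recently returned NXDOMAIN so repeated
        # lookups of bad names are refused locally until the entry expires.
        def __init__(self, negative_ttl=60.0):
            self.negative_ttl = negative_ttl
            self.bad = {}  # name -> expiry time

        def is_known_bad(self, name):
            expiry = self.bad.get(name)
            return expiry is not None and time.monotonic() < expiry

        def record_nxdomain(self, name):
            self.bad[name] = time.monotonic() + self.negative_ttl

    cache = NegativeCache()
    cache.record_nxdomain("tpyo.example.com")      # e.g., a user's typo
    print(cache.is_known_bad("tpyo.example.com"))  # True until the TTL lapses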

Critique

I really liked this paper; I almost always enjoy papers that provide historical perspective and reflection on well-known systems, like this one and the “design of the DARPA Internet protocols” paper that we read at the beginning of the semester. I especially thought that many of the lessons learned and much of the resulting advice in this paper amount to good system-building advice, plain and simple. Some particular gems I liked include: “Documentation should always be written with the assumption that only the examples are read,” “It is often more difficult to remove functions from systems than it is to get a new function added,” “The most capable implementors [sic] lose interest once a new system delivers the level of performance they expect; they are not easily motivated to optimize their use of others’ resources or provide easily used guidelines for administrators that use the systems,” and “Allowing variations in the implementation structure used to provide service is a great idea; allowing variation in the provided service causes problems.”