Showing posts with label DHT.
Thursday, November 12, 2009
Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications
I. Stoica, R. Morris, D. Karger, F. Kaashoek, H. Balakrishnan, "Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications," ACM SIGCOMM Conference, 2001.
One line summary: This paper presents Chord, a protocol for a distributed lookup service built on a distributed hash table abstraction.
Summary
This paper presents a distributed lookup protocol called Chord that is simple and has provable correctness and performance properties. It is based on the distributed hash table (DHT) abstraction. It addresses the problems of load balancing, decentralization, scalability, availability, and flexible naming. Chord specifies how to map key/value pairs to storage nodes. It does this through the use of consistent hashing. This works as follows: each of the nodes participating in Chord is assigned an identifier based on the hash of its IP address. Each key also has an identifier that is the hash of the key. The identifiers form an identifier circle that has 2^m places in it, where m is the length of the identifier in bits. The identifier of a node determines its place in the circle. A key is assigned to the first node that has an identifier equal to or following it on the circle. The node that the key is assigned to is responsible for storing the corresponding key/value pair. Consistent hashing is used for this because it theoretically ensures that key/value pairs are roughly evenly distributed across nodes.
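To make the mapping concrete, here is a minimal sketch of consistent hashing onto the identifier circle; it is not code from the paper, and it uses a small m purely for readability (the paper uses SHA-1 as the base hash function, with m = 160):

```python
import hashlib

M = 16  # identifier length in bits; kept small here, m = 160 in the paper

def chord_id(value: str) -> int:
    """Hash a node's IP address or a key onto the 2^m-position identifier circle."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest(), "big") % 2 ** M

def successor(key_id: int, node_ids: list[int]) -> int:
    """Assign a key to the first node whose identifier equals or follows it."""
    ring = sorted(node_ids)
    return next((n for n in ring if n >= key_id), ring[0])  # wrap around the circle

# Example: which of three nodes stores the pair keyed by "some-key"?
nodes = [chord_id(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
print(successor(chord_id("some-key"), nodes))
```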
Each node in the circle stores a pointer to its successor so that queries for a key/value pair can be routed to the correct node. Each node also stores a finger table in which entry i contains a pointer to the first node that is at least 2^(i-1) places ahead of it on the circle. As an optimization, each node can also store a pointer to its predecessor. The result of this setup is that a query for a key/value pair takes O(log N) steps to answer, where N is the number of nodes. When a node joins Chord it is logically placed in the spot on the circle to which its identifier maps. The predecessors and fingers of the new node and of the other nodes are updated, and all key/value pairs for which the new node is now the successor (those with identifiers falling between its predecessor's identifier and its own) are transferred to it from the node where they were previously stored. To join the circle, a node need only know the identity of one other node already in Chord; from that node it can learn its place on the circle. Chord periodically runs a stabilization procedure in which each node verifies its successor. To deal with failures, each Chord node keeps a list of its r nearest successors so that it can find a new one when its current successor fails.
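Continuing the sketch, here is a rough, standalone illustration of how finger tables support O(log N) routing; the real protocol resolves these lookups with messages between nodes rather than over a shared list of identifiers:

```python
M = 16  # identifier bits, as in the sketch above

def successor(key_id: int, ring: list[int]) -> int:
    """First node on the ring whose identifier is >= key_id, wrapping around."""
    ring = sorted(ring)
    return next((n for n in ring if n >= key_id), ring[0])

def build_finger_table(n: int, ring: list[int]) -> list[int]:
    """Entry i points to the first node at least 2^(i-1) identifiers past n."""
    return [successor((n + 2 ** (i - 1)) % 2 ** M, ring) for i in range(1, M + 1)]

def between(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b going clockwise around the circle."""
    return (a < x < b) if a < b else (x > a or x < b)

def closest_preceding_finger(n: int, key_id: int, fingers: list[int]) -> int:
    """Forward a query to the finger that most closely precedes key_id; each such
    hop roughly halves the remaining distance, which is where O(log N) comes from."""
    for f in reversed(fingers):
        if between(f, n, key_id):
            return f
    return n
```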
The paper next provides an evaluation of Chord. The authors first find that due to the non-uniform distribution of node identifiers, keys are not evenly mapped across the nodes, so they suggest a workaround involving virtual nodes. They next show that the average path length traversed by a query is ½ log N, where N is the number of nodes. They find that Chord experiences virtually no lookup failures due to node failures and very few lookup failures (~3%) for lookups that occur during stabilization. Lastly, they find that lookup latency grows slowly with the number of nodes.
Critique
I liked reading this paper because I’ve seen Chord used in a number of other systems. I have only two main criticisms. First, the load balancing, that is, the distribution of keys among Chord nodes without using virtual nodes, is disappointing but foreseeable. Second, it would be nice if Chord provided some optional, built-in way of replicating keys across nodes. It seems straightforward to do, so it would be nice if it were provided automatically instead of being left to the user.
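For instance, the kind of built-in replication I have in mind could be as simple as writing each pair to the responsible node's next few successors; a hypothetical sketch (the function name and the parameter r are mine, not part of Chord):

```python
def replica_nodes(key_id: int, node_ids: list[int], r: int = 3) -> list[int]:
    """Store a key/value pair on the responsible node and its next r-1 successors,
    so the pair survives individual node failures (r is a hypothetical knob)."""
    ring = sorted(node_ids)
    start = next((i for i, n in enumerate(ring) if n >= key_id), 0)
    return [ring[(start + k) % len(ring)] for k in range(min(r, len(ring)))]
```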
Tuesday, November 3, 2009
Looking Up Data in P2P Systems
H. Balakrishnan, F. Kaashoek, D. Karger, R. Morris, I. Stoica, "Looking Up Data in P2P Systems," Communications of the ACM, V. 46, N. 2, (February 2003).
One line summary: This paper provides an overview of the various DHT-based data lookup mechanisms for peer-to-peer systems.
Summary
This paper begins by introducing the lookup problem in peer-to-peer (P2P) systems: finding data in a large P2P system containing a dynamic set of nodes, in a scalable manner and without introducing any centralization or hierarchy. The paper focuses primarily on solutions based on the distributed hash table (DHT) abstraction, after briefly outlining the problems with solutions involving a central database or hierarchical distributed storage structures. The main benefit of a DHT-based solution is that it imposes few constraints: all data items must be uniquely identified with a numeric key, and nodes must store data on behalf of other nodes. A DHT implements only a lookup operation, which takes as input the identifying key of a data item and returns the location of the node responsible for that data item. A DHT lookup algorithm must address the following: mapping keys to nodes in a load-balanced way, forwarding lookups to the appropriate nodes, providing a distance function that describes the closeness of keys to each other and nodes to each other, and adaptively building routing tables.
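To make the abstraction concrete, here is a minimal sketch of the interface the paper describes; the class and method names are my own and not drawn from any of the surveyed systems:

```python
from abc import ABC, abstractmethod

class DHTLookup(ABC):
    """Illustrative interface for the DHT lookup abstraction; names are my own."""

    @abstractmethod
    def lookup(self, key_id: int) -> str:
        """Return the location (e.g. IP address) of the node responsible for key_id."""

    @abstractmethod
    def distance(self, id_a: int, id_b: int) -> int:
        """Distance function saying how close two identifiers are; each system
        (Chord, Pastry, Kademlia, ...) defines its own."""

    @abstractmethod
    def update_routing_table(self, node: str, joined: bool) -> None:
        """Adaptively maintain the routing table as nodes join and leave."""
```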
The DHT-based lookup algorithms reviewed in this paper include CAN, Chord, Kademlia, Pastry, Tapestry, and Viceroy. These vary in the data structures they use to provide O(log N) lookups, their routing dimensionality and emphasis, their distance functions, the costs of various operations, their degree of fault tolerance and concurrency, their routing behavior, and their security measures. For example, Chord maintains a skiplist-like data structure, while Pastry, Tapestry, and Kademlia use a tree-based structure; CAN, in contrast, uses a d-dimensional Cartesian coordinate space as its abstraction. With respect to distance functions, Chord measures distance clockwise around its identifier space, which makes its function unidirectional but asymmetric; Pastry uses prefixes and numeric distance, so its function is symmetric but not unidirectional; and Kademlia uses an XOR-based function that is both unidirectional and symmetric. Topics for future work in this area include analyzing operation costs when the system is large and joins and departures are frequent, improving routing latency, providing security mechanisms, and implementing efficient indexing and keyword search.
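The contrast between these distance functions is easy to see in a short sketch, assuming 160-bit identifiers as Chord and Kademlia use:

```python
M = 160  # identifier length in bits

def chord_distance(a: int, b: int) -> int:
    """Chord: clockwise distance from a to b around the identifier circle.
    Unidirectional, but asymmetric: in general d(a, b) != d(b, a)."""
    return (b - a) % 2 ** M

def kademlia_distance(a: int, b: int) -> int:
    """Kademlia: XOR of the identifiers. Symmetric (d(a, b) == d(b, a)) and
    unidirectional: for any a and distance d, exactly one b satisfies d(a, b) == d."""
    return a ^ b

a, b = 5, 200
assert chord_distance(a, b) != chord_distance(b, a)   # asymmetric
assert kademlia_distance(a, b) == kademlia_distance(b, a)  # symmetric
```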
Critique
There isn’t much to say in terms of a critique since this paper is just a brief introduction to and description of various algorithms, but I did find it helpful and thought the writing was nicely structured, so I would keep it in the syllabus.
Wednesday, September 16, 2009
Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises
C. Kim, M. Caesar, J. Rexford, "Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises," ACM SIGCOMM Conference, (August 2008).
One line summary: This paper describes SEATTLE, an Ethernet-based architecture for use as a building block in large enterprise networks that uses a DHT to provide the same functionality as conventional Ethernet, but in a scalable and easy-to-administer way.
Summary
This paper discusses an alternative Ethernet-based protocol called SEATTLE for use as a network building block in enterprise and access provider networks. It is motivated by the observation that despite its usefulness, especially its plug-and-play semantics, conventional Ethernet scales poorly and has several limitations: (1) it disseminates every host’s location globally using flooding, resulting in high control overhead and large forwarding tables; (2) it forces paths onto a single spanning tree, constraining route selection; and (3) bootstrapping protocols such as ARP and DHCP rely on broadcasts for frequent, basic operations, degrading network performance. One alternative that attempts to address some of these problems is to decompose the network into multiple LANs interconnected by IP routing. The disadvantages of this approach include (1) configuration overhead due to hierarchical addressing and the difficulty of keeping routers and DHCP servers consistent, (2) complexity in implementing network policies, since these are often based on the IP prefix of a host, which is subject to change, and (3) limited support for host mobility, making it difficult for hosts to move from one subnet to another. A further approach uses virtual LANs (VLANs) instead of LANs to address these deficiencies in turn; however, it too has limitations: (1) overhead in configuring trunks, since deciding which bridges should be in a given VLAN must often be done manually, (2) large forwarding tables on bridges that belong to multiple VLANs, and (3) a single spanning tree per VLAN, which limits route selection and requires manual rebalancing and updating to adjust to shifts in network load.
SEATTLE in turn addresses all of these problems. It maintains MAC-to-IP and MAC-to-location mappings using a one-hop DHT. This information is hashed onto switches using consistent hashing to reduce re-hashing overhead when a switch fails and keys must be remapped. Among switches, a link-state routing protocol allows routing along shortest paths. SEATTLE allows for the creation of virtual switches for load balancing: more powerful switches can host more virtual switches and thus take on a larger share of the load. The DHT can also be used to store information about various network services or attributes, such as printers and servers, by hashing a descriptor of the service along with its location or other information. SEATTLE can also be configured to use a multi-level one-hop DHT to run over hierarchical networks with several regions connected by backbones; in this case, separate regional and backbone hash rings are maintained. SEATTLE is backwards compatible with end hosts using conventional Ethernet. SEATTLE also allows for the definition of groups, where a group is a set of hosts that share the same broadcast domain; this is important for backward compatibility with conventional Ethernet and also facilitates the implementation of reachability policies.
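As a rough illustration of the resolver placement idea, here is a minimal sketch of consistent hashing of host MAC addresses onto (virtual) switches; the names and structure are my own, not the paper's:

```python
import hashlib
from bisect import bisect_left

def hash_id(value: str) -> int:
    """Hash switch names and host MAC addresses onto one identifier ring."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest(), "big")

class OneHopDHT:
    """Sketch of SEATTLE-style resolver placement. Because every switch learns about
    every other switch via link-state routing, a lookup is a single hop: hash the
    MAC and contact the resolver switch directly."""

    def __init__(self, switches: dict[str, int]):
        # Each physical switch contributes `weight` virtual switches for load balancing.
        self.ring = sorted(
            (hash_id(f"{name}#{v}"), name)
            for name, weight in switches.items()
            for v in range(weight)
        )

    def resolver(self, mac: str) -> str:
        """Consistent hashing: the first virtual switch clockwise from hash(mac)."""
        points = [p for p, _ in self.ring]
        idx = bisect_left(points, hash_id(mac)) % len(self.ring)
        return self.ring[idx][1]

# switch-C is more powerful, so it hosts more virtual switches and stores more mappings.
dht = OneHopDHT({"switch-A": 1, "switch-B": 1, "switch-C": 4})
print(dht.resolver("00:1a:2b:3c:4d:5e"))  # switch holding this host's location mapping
```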
The paper then provides the results of several simulations using four topologies that the authors consider representative. They measure SEATTLE’s sensitivity to cache eviction timeouts, network changes, host mobility, and switch failures, report its forwarding table size and control overhead, and compare it with other approaches. They also describe a prototype implementation of SEATTLE.
Critique
I liked this paper. There was quite a bit of time between my reading it and writing this summary, so I can’t remember specific criticisms or points I wanted to make about it. The authors did a good job of explaining what was wrong with current solutions, and I liked how they addressed each of these problems. One thing that confused me about their simulations is that they said they leveraged real network traces and supplemented them with synthetic traces, but I must not fully understand how these are used, because it seems like once you swap in a different building block (SEATTLE in place of Ethernet), a sequence of events in the original trace might no longer make sense. One thing I liked about their simulations is that, unlike in many of the other papers we’ve read, their simulated networks were at least large. I feel like a lot of other experiments that use topologies with only a handful of hosts are not as interesting, although in many cases there is probably nothing wrong with evaluating in this way.
Labels:
Caesar,
data center,
DHT,
enterprise networks,
Ethernet,
IP,
Kim,
Rexford,
SEATTLE