Sara's Networks Class Blog: multicast

S. Floyd, V. Jacobson, S. McCanne, C-G Liu, L. Zhang, "A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing," ACM SIGCOMM Conference, (August 1995).

One line summary: This paper describes Scalable Reliable Multicast, a minimal framework from which application designers can build multicast functionality suited to their application; it also presents some analytical results pertaining to request/repair algorithms in multicast, along with several simulations.

Summary

This paper describes a reliable multicast framework called Scalable Reliable Multicast (SRM). SRM builds on the principles of Application Level Framing (ALF), which “explicitly includes an application’s semantics in the design of that application’s protocol”, and Light-Weight Sessions (LWS), which centers on a “light-weight rendezvous mechanism based on the IP multicast distribution model” with receiver-based adaptation. Thus, SRM is designed to meet only the minimal definition of reliable multicast, so as not to force on applications unnecessary overhead for functionality that they do not need. It is also designed to adhere to the core principles of TCP/IP in that it requires only the basic IP delivery model and dynamically adjusts control parameters based on observed performance, much like TCP. The authors argue that receiver-based reliability is more appropriate for SRM than sender-based reliability because the fate-sharing-based coupling of unicast does not generalize well to multicast (due to such factors as the ACK implosion effect and the need for the sender to maintain state about each of the receivers in the receiver set), and because the vocabulary of unicast conventions migrates poorly to multicast. SRM attempts to serve as a skeleton common to scalable, reliable multicast applications, which in turn supply the framework with details such as a namespace, policies and mechanisms for apportioning bandwidth, etc.

The authors go on to describe wb, a network conferencing tool built on SRM that provides a distributed whiteboard. In wb, users are members, each of whom has a globally unique identifier that is used to label the pages they create and edit. Wb assumes that all data has a unique name, that the name always refers to the same data, that source-IDs are persistent, that IP multicast datagram delivery is available, and that all participants join the same multicast group. The paper describes wb’s instantiation of SRM. It then describes request/repair algorithms more generally for several simple topologies, including chains, stars, and bounded-degree trees. In the context of these algorithms, it defines deterministic suppression and probabilistic suppression of duplicate messages. The authors find, via simulations, that their algorithm with fixed timer parameters performs well in random or bounded-degree trees when every node in the tree is a member of the multicast group. They use this to motivate the development of an adaptive request/repair algorithm that adjusts timer parameters as a function of delay as well as of the number of duplicate requests or repairs in recent recovery exchanges. They demonstrate trade-offs between low delay and a low number of duplicates.
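To get some intuition for probabilistic suppression, here is a minimal Python sketch of the star case, where every member is the same distance from the source, so only randomized timers can spread the requests out. As far as I understand the paper, request timers are drawn uniformly from an interval proportional to the estimated delay d to the source, [C1*d, (C1+C2)*d]; the function names, the delay value of 2.0 (one hop to the center, one hop out), and the simplification that everyone detects the loss at the same instant are my own assumptions, not the paper's code.

```python
import random


def request_timer(delay_to_source, c1, c2, rng=random):
    """Draw a request timer uniformly from [c1*d, (c1+c2)*d]."""
    return rng.uniform(c1 * delay_to_source, (c1 + c2) * delay_to_source)


def count_requests(timers, member_to_member_delay):
    """Count members whose timer expires before the earliest request reaches them.

    Assumes all members detect the loss at time 0 and that the earliest
    request needs member_to_member_delay to reach every other member.
    """
    cutoff = min(timers) + member_to_member_delay
    return sum(1 for t in timers if t < cutoff)


def simulate_star(num_members, c1, c2, seed=0):
    """One loss-recovery round on a star (hypothetical toy setup).

    Every member is 1 delay unit from the center, so the delay to the
    source (another leaf) and to any other member is 2 units.
    """
    rng = random.Random(seed)
    d = 2.0
    timers = [request_timer(d, c1, c2, rng) for _ in range(num_members)]
    return count_requests(timers, member_to_member_delay=d)


# Hand-checkable case: the first timer fires at 1.0, its request arrives
# at 3.0, so only the members with timers 1.0 and 2.5 send requests.
print(count_requests([1.0, 2.5, 3.5, 4.0], 2.0))
```

Making c2 larger spreads the timers over a wider interval, which tends to reduce duplicate requests at the cost of a longer wait before the first request, the same delay versus duplicates trade-off the authors report.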

Critique

I didn’t really like reading this paper. I don’t really like simulations, for one. It also annoys me that the authors chose what they admit to be arbitrary values for some of the parameters in their algorithm. Nevertheless, however simplified the topologies in their analysis section, I would probably agree that it is good to have some mathematical results for many problems, in this case those pertaining to request/repair algorithms in multicast, in order to provide some intuition to people implementing actual systems and to help guide their choices. In that vein, I would be interested in knowing more about any work that directly builds on or uses their idea of a framework for scalable, reliable multicast, and in learning exactly how this particular paper was useful. On an unrelated note, it also very seriously annoyed me that the graphs in this paper lacked axis or data labels of any kind.

S. Banerjee, B. Bhattacharjee, C. Kommareddy, "Scalable Application Layer Multicast," ACM SIGCOMM Conference, (August 2002).

One line summary: This paper describes NICE, a new application-layer multicast protocol intended primarily for low-bandwidth, data streaming applications with large receiver sets.

Summary

This paper describes NICE, a new application-layer multicast protocol for low-bandwidth, data streaming applications with large receiver sets. The problem with network-layer multicast, as opposed to application-layer multicast, is that it has not been widely adopted by most ISPs, so much of the Internet is incapable of native multicast. In application-layer multicast, the network infrastructure is left unchanged, while the multicast forwarding functionality is implemented at end-hosts. Application-layer multicast protocols are less efficient than network-layer multicast protocols because they may have to send the same packet over the same link more than once. Thus, two good metrics by which to judge application-layer multicast protocols are stress and stretch. Stress is defined per link and is the number of identical packets sent by a protocol over that underlying link in the network. Stretch is defined per multicast group member and is the ratio of the path length from the source to the member along the overlay used by the multicast protocol to the length of the direct unicast path.
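To make the two metrics concrete, here is a small Python sketch on a toy topology of my own (a source S and two hosts A and B, all attached to one router R). The function names, the topology, and the use of hop count as path length are assumptions for illustration only; they are not from the paper.

```python
from collections import Counter


def link_stress(overlay_hops, underlay_route):
    """Count how many copies of one packet cross each physical link.

    overlay_hops: list of (sender, receiver) overlay edges used for delivery.
    underlay_route: maps each overlay edge to the physical links it traverses.
    Links are treated as undirected, so ("R", "A") and ("A", "R") are the same.
    """
    stress = Counter()
    for hop in overlay_hops:
        for link in underlay_route[hop]:
            stress[tuple(sorted(link))] += 1
    return stress


def stretch(overlay_path_length, unicast_path_length):
    """Ratio of the overlay path length to the direct unicast path length."""
    return overlay_path_length / unicast_path_length


# Toy overlay: S sends to A, and A forwards to B.
routes = {
    ("S", "A"): [("S", "R"), ("R", "A")],
    ("A", "B"): [("A", "R"), ("R", "B")],
}
stress = link_stress([("S", "A"), ("A", "B")], routes)

# The A-R link carries the packet twice (once in, once back out).
print(stress[("A", "R")])

# B is reached over 4 physical hops via A, versus 2 hops directly from S.
print(stretch(4, 2))
```

In this toy case native multicast would put stress 1 on every link and give every member stretch 1, which is the baseline that application-layer protocols are measured against.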

The NICE protocol works by logically arranging the multicast group members into a hierarchy. Each host in the hierarchy belongs to a layer, and the hosts in each layer are partitioned into a set of clusters, where each cluster has a cluster leader that is the graph-theoretic center of the cluster. The size of each cluster is bounded between k and 3k-1, and clusters consist of hosts that are close to each other. Hosts are organized into layers according to the rule that all hosts are part of the lowest layer, L(0), and the cluster leaders of all the clusters in layer L(i) join layer L(i+1). A host is allowed to belong to only a single cluster at any layer. If a host is present in layer L(i), it must be a cluster leader in each of layers L(0), ..., L(i-1). If a host is not present in a layer L(i), it cannot be present in any layer L(j) where j > i. Lastly, there can be at most log_k N layers, and the highest layer has only a single member. The supercluster of any host x is defined as the cluster in the next highest layer to which x’s cluster leader y belongs.
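The layering rules can be sketched in a few lines of Python. This is only an illustration under strong assumptions of my own: hosts are numbers on a line, distance is the absolute difference, and clusters are formed by chunking the sorted hosts, whereas the real protocol forms clusters in a distributed way from measured distances. All function names are hypothetical.

```python
def cluster_center(cluster, dist):
    """Graph-theoretic center: the member whose farthest peer is nearest."""
    return min(cluster, key=lambda h: max(dist(h, other) for other in cluster))


def partition(hosts, k):
    """Chunk sorted hosts into clusters of size k (toy clustering rule).

    A short tail is folded into the previous cluster, so sizes stay
    between k and 2k-1, which is within the k..3k-1 bound. If fewer than
    k hosts remain in total, they form one undersized cluster.
    """
    hosts = sorted(hosts)
    clusters = [hosts[i:i + k] for i in range(0, len(hosts), k)]
    if len(clusters) > 1 and len(clusters[-1]) < k:
        tail = clusters.pop()
        clusters[-1].extend(tail)
    return clusters


def build_layers(hosts, k, dist):
    """Return layers[i] = list of clusters in L(i); the top layer has one member."""
    layers = []
    current = list(hosts)
    while len(current) > 1:
        clusters = partition(current, k)
        layers.append(clusters)
        # The leaders of the L(i) clusters are exactly the members of L(i+1).
        current = [cluster_center(c, dist) for c in clusters]
    layers.append([current])
    return layers


layers = build_layers(range(9), k=3, dist=lambda a, b: abs(a - b))
for i, clusters in enumerate(layers):
    print(f"L({i}): {clusters}")
```

With nine hosts and k = 3 this gives three L(0) clusters, a single L(1) cluster made of their leaders (1, 4, and 7), and host 4 alone at the top, so host 4 is a leader in every layer below it, as the rules require.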

The hierarchy is then used to define different overlay structures for control messages and data delivery paths. In NICE, the control topology within each cluster is a clique and the data topology within each cluster is a star, but it is possible to choose other topologies. The NICE protocol assumes the existence of a special host that all hosts know of, called the Rendezvous Point (RP), and the RP is always the leader of the cluster at the highest layer of the hierarchy. The NICE protocol consists of three main components: (1) initial cluster assignment when a new host joins, (2) periodic cluster maintenance and refinement, and (3) recovery from leader failures. In initial cluster assignment, the joining host contacts all the members in the highest layer to identify the member closest to it, then contacts the lower-level cluster of that closest member to find the closest member there, and so on until the host finds its L(0) cluster. This process involves O(k log N) query-response pairs. During this process, the center of some clusters may change, so a new cluster leader must be chosen. To aid in cluster maintenance and refinement, each member of a cluster sends periodic heartbeat messages to the other members of its cluster. There is also a method for splitting and merging clusters to keep them within the size bound, as well as a method for refining attachments in case a host is not able to locate the closest cluster in a layer when joining. Leader failures are dealt with using a remove message if the leader is able to leave gracefully, or by detecting the failure through missing heartbeat messages and selecting a new leader.
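The join procedure is essentially a top-down search through the hierarchy, which the following Python sketch tries to capture. The `layers` structure, the one-dimensional host positions, and the function name are my own assumptions for illustration; the real protocol does this with query messages and measured latencies rather than a local data structure.

```python
def find_l0_cluster(new_host, layers, dist):
    """Descend from the top layer to the L(0) cluster closest to new_host.

    layers[i] is the list of clusters (lists of hosts) in layer L(i), and
    the top layer holds a single cluster. Returns the chosen L(0) cluster
    and the number of members queried along the way.
    """
    top = len(layers) - 1
    cluster = layers[top][0]
    queries = 0
    for i in range(top, 0, -1):
        # Query every member of the current cluster and keep the closest.
        queries += len(cluster)
        closest = min(cluster, key=lambda h: dist(new_host, h))
        # That member leads a cluster one layer down; search there next.
        cluster = next(c for c in layers[i - 1] if closest in c)
    return cluster, queries


# Hypothetical hierarchy: hosts are points on a line, k = 3.
layers = [
    [[0, 1, 2], [3, 4, 5], [6, 7, 8]],  # L(0)
    [[1, 4, 7]],                        # L(1): leaders of the L(0) clusters
    [[4]],                              # L(2): single top-level member
]
cluster, queries = find_l0_cluster(7.4, layers, dist=lambda a, b: abs(a - b))
print(cluster, queries)
```

Since each cluster has at most 3k-1 members and there are at most log_k N layers, the number of queries in a descent like this is O(k log N), matching the bound quoted above.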

To evaluate NICE, the authors first simulated their protocol and compared it to three other schemes: multi-unicast, native IP multicast (CBT), and the Narada application-layer multicast protocol. Their main findings are that NICE data paths have stretch comparable to Narada’s; that the stress on links and routers is lower in NICE, especially as the multicast group size increases; that the failure recovery of the two schemes is comparable; and that NICE achieves this performance with orders of magnitude lower control overhead for group sizes greater than 32. The authors also implemented the NICE protocol and tested their implementation in a wide-area testbed over a period of a month. They observed a maximum packet loss of 1% as members joined and left the group at random, and an average control overhead of less than 1 Kbps for groups of size 100.

Critique

I liked this paper, and it was informative to read about application-layer multicast as opposed to network-layer multicast. Even though performance at the application layer is, from what I can see, pretty much unavoidably worse than at the network layer, if the authors are right that ISPs do not tend to support native multicast, then that is a good enough reason to use application-layer multicast despite the performance hit. I thought this paper did a nice job of describing how the NICE hierarchy is built, but the authors could have done a better job of explaining the control and data paths used. Overall, as it appears that many application-layer multicast protocols have been developed, this one seemed like as fine a choice as any to read about, though it’s hard to say without having read most of the other application-layer multicast protocol papers. It might be nice to read a survey paper, as we did for DHT-based lookup protocols, that summarizes the different kinds of multicast protocols and their differences.

Tuesday, December 1, 2009
