Tuesday, December 1, 2009
A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing
S. Floyd, V. Jacobson, S. McCanne, C-G Liu, L. Zhang, "A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing," ACM SIGCOMM Conference, (August 1995).
One line summary: This paper describes Scalable Reliable Multicast, a minimal framework from which application designers can build multicast functionality suitable to their application; it also describes some analytical results pertaining to request/repair algorithms in multicast, along with several simulations.
Summary
This paper describes a reliable multicast framework called Scalable Reliable Multicast (SRM). SRM builds on the principles of Application Level Framing (ALF), which “explicitly includes an application’s semantics in the design of that application’s protocol,” and Light-Weight Sessions (LWS), which centers on a “light-weight rendezvous mechanism based on the IP multicast distribution model” with receiver-based adaptation. Thus, SRM is designed to meet only the minimal definition of reliable multicast, so as not to force unnecessary overhead on applications for functionality they do not need. It is also designed to adhere to the core principles of TCP/IP in that it requires only the basic IP delivery model and dynamically adjusts control parameters based on observed performance, much like TCP. The authors argue that receiver-based reliability is more appropriate for SRM than sender-based reliability because the fate-sharing-based coupling of unicast does not generalize well to multicast (due to factors such as the ACK implosion effect and the need for the sender to maintain state about each of the receivers in the receiver set), and because the vocabulary of unicast conventions migrates poorly to multicast. SRM attempts to serve as a skeleton common to scalable, reliable multicast applications, which supply the framework with details such as a namespace, policies and mechanisms for apportioning bandwidth, and so on.
The authors go on to describe wb, a network conferencing tool built on SRM that provides a distributed whiteboard. In wb, users are members, each of whom has a globally unique identifier used to label the pages they create and edit. Wb assumes that all data has a unique name, that a name always refers to the same data, that source IDs are persistent, that IP multicast datagram delivery is available, and that all participants join the same multicast group. The paper describes wb’s instantiation of SRM. It then describes request/repair algorithms more generally for several simple topologies, including chains, stars, and bounded-degree trees. In the context of these algorithms, it defines deterministic suppression and probabilistic suppression of duplicate messages. The authors find, via simulations, that their algorithm with fixed timer parameters performs well in random or bounded-degree trees when every node in the tree is a member of the multicast group. They use this to motivate the development of an adaptive request/repair algorithm that adjusts timer parameters as a function of delay and of duplicate requests or repairs in recent recovery exchanges. They demonstrate trade-offs between low delay and a low number of duplicates.
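To make the timer mechanics concrete, here is a minimal Python sketch of SRM-style request/repair scheduling as I understand it from the paper. The constants C1, C2, D1, and D2 stand in for the fixed timer parameters the authors admit are somewhat arbitrary; the delay estimates and the backoff handling are simplified and are my own illustration, not the authors' code.

    import random

    C1, C2 = 2.0, 2.0   # request timer constants (fixed-timer variant; values are placeholders)
    D1, D2 = 1.0, 1.0   # repair timer constants

    def schedule_request_timer(delay_to_source):
        """A member that detects a loss waits a random time in
        [C1*d, (C1+C2)*d] before multicasting a repair request; the delay
        term gives deterministic suppression and the randomization gives
        probabilistic suppression of duplicate requests."""
        d = delay_to_source
        return random.uniform(C1 * d, (C1 + C2) * d)

    def schedule_repair_timer(delay_to_requester):
        """A member that holds the requested data waits a random time in
        [D1*d, (D1+D2)*d] before multicasting the repair, and cancels if it
        hears someone else's repair first."""
        d = delay_to_requester
        return random.uniform(D1 * d, (D1 + D2) * d)

    def on_duplicate_request(current_timer):
        """If another member's request for the same data arrives before the
        local timer fires, suppress: back off and reschedule rather than
        sending an immediate duplicate (a simplification of the paper's
        exponential backoff)."""
        return 2 * current_timer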
Critique
I didn’t really like reading this paper. I don’t really like simulations, for one. It also annoys me that the authors chose what they admit to be arbitrary values for some of the parameters in their algorithm. Nevertheless, however simplified the topologies in their analysis section may be, I would agree that it is good to have some mathematical results for problems like these, in this case those pertaining to request/repair algorithms in multicast, to provide some intuition to people implementing actual systems and to help guide their choices. In that vein, I would be interested in knowing more about any work that directly builds on or uses their idea of a framework for scalable, reliable multicast, and in exactly what way this particular paper was useful to it. On an unrelated note, it also very seriously annoyed me that the graphs in this paper lacked axis or data labels of any kind.
Tuesday, September 15, 2009
Random Early Detection Gateways for Congestion Avoidance
S. Floyd, V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, (August 1993).
One line summary: This paper presents an algorithm for congestion avoidance called Random Early Detection (RED) that drops packets probabilistically when the computed average queue length is between a minimum and maximum threshold in order to avoid congestion.
Summary
This paper presents an algorithm for congestion avoidance called Random Early Detection (RED). The authors argue that the most effective congestion detection can occur at gateways, as opposed to the endpoints. In the current Internet, TCP only detects congestion when a packet has been dropped, meaning the queue at the gateway is full. Large queues at gateways are especially undesirable in networks with large delay-bandwidth product connections, as they can significantly increase the average delay in the network. The authors also assume that it is useful to have queues at gateways in which traffic is multiplexed together with FIFO scheduling, and that per-connection gateway mechanisms such as those used in Fair Queueing should only be used when not having per-flow mechanisms is clearly inadequate. As RED does not use per-connection mechanisms, it is intended for networks where transport protocols are cooperative and respond to congestion indications, although it can be used to control congestion even when sources are not cooperative. In addition to the goals of keeping average queue size low and detecting congestion, RED places emphasis on avoiding global synchronization of TCP connections and avoiding bias against bursty traffic.
The RED algorithm works by first calculating the average queue size using a low-pass filter, an exponentially weighted moving average. RED defines two thresholds, a minimum and a maximum. When the average queue size is below the minimum threshold, no packets are dropped. When it is above the maximum threshold, all packets are dropped. When it is between the two thresholds, each arriving packet is dropped with a probability that is a function of the average queue size; the drops are spread out roughly uniformly over arriving packets, so the fraction of a connection’s packets that are dropped is roughly proportional to that connection’s share of the bandwidth at the gateway. Later in the paper, the authors discuss setting the parameters for the algorithm. The weight used in calculating the average should be selected to be not so low that the average responds too slowly to changes in the actual queue size, but not so high that it fluctuates too wildly in response to bursty traffic. The minimum threshold shouldn’t be too low if traffic is fairly bursty. The maximum threshold depends on how high an average queue length is tolerable. A rule of thumb is that the difference between the threshold values should be larger than the typical queue length increase in one RTT, in order to avoid global synchronization.
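As a concrete illustration, here is a rough sketch of that per-packet decision logic. The parameter names (wq, min_th, max_th, max_p) follow the paper, but the numeric defaults are placeholders and the code is a simplification of my own (it ignores, for example, the paper's correction for idle periods when computing the average).

    import random

    def red_drop_decision(queue_len, state, wq=0.002, min_th=5, max_th=15, max_p=0.02):
        """Return True if the arriving packet should be dropped (or marked)."""
        # Low-pass filter: exponentially weighted moving average of the queue length.
        state['avg'] = (1 - wq) * state['avg'] + wq * queue_len
        avg = state['avg']

        if avg < min_th:
            state['count'] = -1          # below the minimum threshold: never drop
            return False
        if avg >= max_th:
            state['count'] = 0           # above the maximum threshold: always drop
            return True

        # Between the thresholds: drop with a probability that grows linearly
        # with the average queue size ...
        state['count'] += 1
        pb = max_p * (avg - min_th) / (max_th - min_th)
        # ... spread out by the count of packets since the last drop, so drops
        # land roughly uniformly over arrivals rather than in clusters.
        pa = pb / max(1e-9, 1 - state['count'] * pb)
        if random.random() < pa:
            state['count'] = 0
            return True
        return False

    # Usage sketch: state = {'avg': 0.0, 'count': -1}
    #               drop = red_drop_decision(len(queue), state)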
The authors state that RED meets the following goals: congestion avoidance, congestion detection at the same time scale as connections can respond to congestion, no global synchronization, simplicity, maximizing the throughput-to-delay ratio, no bias against bursty traffic (which the authors somewhat misleadingly term “fairness”), and appropriateness for a wide range of network environments. To show this they conduct a number of simulation experiments that compare RED with Drop Tail and Random Drop gateways. They also show how RED gateways can be used to identify connections using more than their fair share of bandwidth by noting which connections are having more of their packets dropped, as the probability of a connection having its packets dropped is assumed to be equal to that connection’s fraction of the link bandwidth through the gateway. Lastly, they provide an efficient implementation of the RED algorithm.
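On the identification point above: a toy illustration (my own, not the paper's mechanism verbatim) of how drop statistics can flag connections taking more than their share follows directly from the proportionality assumption.

    from collections import Counter

    def flag_heavy_connections(recent_drop_flow_ids, num_active_flows, factor=2.0):
        """Flag connections whose share of recently dropped packets is well
        above a fair share; under RED's assumption, a connection's share of
        drops approximates its share of bandwidth through the gateway."""
        counts = Counter(recent_drop_flow_ids)
        total = sum(counts.values()) or 1
        fair_share = 1.0 / max(1, num_active_flows)
        return [flow for flow, n in counts.items()
                if n / total > factor * fair_share]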
Critique
I didn’t really like this paper. I found it interesting that they argued for FIFO queueing, and I’m not sure I found their arguments compelling after reading the other congestion avoidance papers we read. I also found their talk about their algorithm being “fair” rather misleading, since they don’t mean fair in the accepted sense. I think they gloss over the issue of true fairness as it was discussed in the Fair Queueing paper a little too much. Also, proper setting of their algorithm’s parameters seems to be important, but despite their rough guidelines it may not always be clear what the proper settings are. This seems like an undesirable property of their algorithm. It does seem like a step up from Drop Tail, and a smaller step up from Random Drop; I will give them that. However, as I think we mentioned in class, I don’t think it has been widely implemented anyway. On more of a nit-picky note, I thought that a lot of their phrasing was awkward and their graphs were absolute crap (difficult to read, and they didn’t even have legends). I didn’t think the paper needed to be 20 pages long.
Monday, September 7, 2009
Congestion Avoidance and Control
V. Jacobson, M. Karels, "Congestion Avoidance and Control," ACM SIGCOMM Conference, (August 1988).
One line summary: This paper describes several important algorithms for congestion control in TCP, including slow-start, exponential backoff, multiplicative decrease additive increase, and RTT variance estimation; the paper also provides results that demonstrate that implementations of TCP using the algorithms described in the paper are better than older implementations.
Summary
The authors note that due to an explosive increase in size, computer networks have increasingly experienced congestion problems. Seven algorithms added to TCP are meant to help deal with this: round-trip time (RTT) variance estimation, exponential backoff, slow start, an aggressive receiver ACK policy, dynamic window sizing, Karn’s algorithm, and fast retransmit. The authors describe the first five of these algorithms. According to the authors, these algorithms stem from the observation that a TCP connection should obey a “conservation of packets” principle, which states that for a connection in equilibrium, the sender should not place a new packet into the connection until an old packet leaves. There are three instances in which this principle will not be upheld: when the connection doesn’t initially reach equilibrium; when, after equilibrium is reached, a new packet is sent before an old packet leaves; and when resource limitations along the connection path make equilibrium unattainable. The authors address each of these in turn.
The first failure occurs when a connection is being started or restarted. The authors note that a TCP connection is a self-clocking system, which helps make it stable; however, this also means that in order to get packets flowing there must be ACKs to clock them out, but to get ACKs there must be packets flowing. The authors thus describe the slow-start algorithm as a method for “starting the clock” and gradually increasing the number of packets in flight. The slow-start algorithm is as follows: for each connection, store a variable called the congestion window that will be used to determine the number of packets to send. When starting or restarting, set the congestion window to 1 packet. Every time a packet is acknowledged, increase the congestion window by 1. To select how many packets to send at a time, choose the minimum of the congestion window and the receiver’s advertised window. The result of this algorithm is that if no packets are lost or delayed, the congestion window doubles every RTT, and the sender never sends packets at a rate faster than twice the maximum rate the connection path can sustain. The authors find that this algorithm greatly decreases the number of retransmits and increases effective bandwidth.
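A toy rendering of that bookkeeping (my own simplification in Python, not the BSD implementation) looks roughly like this:

    class SlowStartSender:
        def __init__(self, receiver_window):
            self.cwnd = 1                     # congestion window, in packets
            self.rwnd = receiver_window       # receiver's advertised window

        def on_restart(self):
            self.cwnd = 1                     # restart the "clock" from one packet

        def on_ack(self):
            self.cwnd += 1                    # +1 per ACK, so cwnd doubles every RTT

        def send_window(self):
            return min(self.cwnd, self.rwnd)  # never exceed the advertised window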
The authors next explain that the second failure must be due to a sender’s faulty retransmit timer. This timer should estimate the RTT, which increases as load increases, in order to determine when a packet should be retransmitted; one mistake in doing this is failing to estimate the variance in the RTT. The authors present an easy-to-calculate method for estimating this variance in their appendix. Another mistake is not choosing the correct backoff after a packet is retransmitted. The authors argue that only exponential backoff schemes have any hope of working in the general case. The authors demonstrate that a timer estimating RTT with their method approximates the actual RTT more accurately than the method previously used in TCP.
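For reference, here is a hedged sketch of the mean/mean-deviation estimator from the appendix, written in floating point rather than the paper's scaled integer arithmetic. The gains of 1/8 and 1/4 are the commonly cited values, and the multiplier on the deviation term has varied across implementations, so treat the constants as illustrative rather than as the paper's exact code.

    class RttEstimator:
        def __init__(self):
            self.srtt = None      # smoothed RTT estimate
            self.rttvar = 0.0     # smoothed mean deviation (a cheap stand-in for variance)

        def update(self, measured_rtt):
            if self.srtt is None:
                self.srtt = measured_rtt
                self.rttvar = measured_rtt / 2
            else:
                err = measured_rtt - self.srtt
                self.srtt += err / 8                          # gain 1/8 on the mean
                self.rttvar += (abs(err) - self.rttvar) / 4   # gain 1/4 on the deviation

        def rto(self):
            return self.srtt + 4 * self.rttvar                # retransmit timeout

    def backed_off_rto(base_rto, attempt):
        """Exponential backoff between successive retransmissions of the same packet."""
        return base_rto * (2 ** attempt)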
Lastly, the authors point out that if the timers are doing a good job of estimating RTT, then a timeout probably indicates a lost packet instead of a broken timer. Lost packets are typically a sign that the network is congested, causing connections to encounter resource limits, which is the third failure case. Connection endpoints must have a way of responding to congestion such that they decrease the amount they send when the network is congested and increase it when possible, in order to achieve proper utilization of network resources while avoiding congestion. To this end, the authors introduce a multiplicative decrease additive increase algorithm. The decrease happens much faster than the increase because it is easy for the network to become congested and much harder for it to become uncongested. The algorithm works as follows: on a timeout, the congestion window is set to half of its current size (multiplicative decrease). For each packet acknowledgement received thereafter, increase the congestion window (cwnd) by 1/cwnd (additive increase). Then, when choosing the number of packets to send, choose the minimum of the congestion window and the receiver’s advertised window. The authors note that this does not include slow-start, because the two are entirely different algorithms, but they should be implemented together, which is usually done by having separate slow-start and congestion avoidance phases. The authors show that their congestion avoidance approach results in fewer retransmits and higher bandwidth utilization.
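A compact sketch of those rules (again my own simplification; it leaves out the separate slow-start phase that real implementations combine with this):

    class CongestionAvoidanceSender:
        def __init__(self, receiver_window):
            self.cwnd = 1.0
            self.rwnd = receiver_window

        def on_timeout(self):
            self.cwnd = max(1.0, self.cwnd / 2)   # multiplicative decrease

        def on_ack(self):
            self.cwnd += 1.0 / self.cwnd          # additive increase: about +1 packet per RTT

        def send_window(self):
            return min(int(self.cwnd), self.rwnd)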
Critique
The algorithms the authors propose in this paper later made it into newer implementations of TCP, so in that sense, they were very successful. Their results demonstrate that their algorithms were generally an improvement over previous implementations. However, as the authors point out, there are still improvements to be made. For instance, none of their solutions ensure fair sharing of network bandwidth, so they propose this as an area of future work. (Then again, according to Kurose and Ross, the multiplicative decrease additive increase algorithm was later shown by Chiu, in 1989, a year after this paper was published, to converge to an equal sharing of bandwidth, so perhaps it was the case that this work did not need to be done after all.)
Also, in certain situations, TCP’s exponential backoff and congestion avoidance can cause serious performance problems. Regular TCP can be improved for high-performance networks, and there are apparently some extensions to TCP for this. Investigation reveals that some of the later changes to TCP’s congestion avoidance algorithms were suggested in order to allow it to handle long fat networks (LFNs) and high-latency links more appropriately. Lastly, some of the assumptions that the authors make about the network aren’t necessarily always valid. For instance, the authors state that packets are lost due to either network congestion or packet damage, but that the latter occurs so infrequently that it is safe to assume a timeout is almost always a sign of network congestion. However, in wireless networks this assumption does not generally hold, so TCP’s congestion avoidance can in that case result in link underutilization. Thus, a modified mechanism that takes this into account would be appropriate for TCP in wireless networks.
I thought this paper was good, and obviously the algorithms described in the paper are important, as they are now an integral part of TCP. Congestion control is a very important service, without which the Internet could not function well, if at all. I was somewhat confused as to how many of the suggested algorithms were original contributions, since some of the footnotes seem to indicate that very similar algorithms had first been suggested elsewhere. I found that the authors’ tendency to reference a handful of papers I hadn’t read, without briefly explaining what was in those papers, inhibited my understanding. And just as a matter of personal taste, I did not like that so much information was crammed into the footnotes and appendices; to me this made the paper less readable, although aside from this it was well written.
[EDIT: From my comment about fairness as examined by Chiu it's probably clear that I read this paper before I read the other one. ;-) ]