Friday, October 30, 2009

Active Network Vision and Reality: Lessons from a Capsule-Based System


D. Wetherall, "Active Network Vision and Reality: Lessons from a Capsule-Based System," 17th Symposium on Operating Systems Principles, (December 1999).


One line summary: This paper examines some of the points of debate concerning active networks by describing the experience of designing and implementing an active network toolkit called ANTS; it explores some of the benefits and potential uses of ANTS and also discusses the practical difficulties it raises concerning performance and security.

Summary

This paper studies active networks by designing and implementing the ANTS active network toolkit. The paper first defines active networks as a new approach to network architecture in which customized, application-specific code is executed inside the network rather than only at the end hosts. Active networks are interesting because of their potential for creating new Internet services, but controversial due to performance and security concerns. The authors contrast their experience with ANTS with the original vision behind active networks in three areas: (1) the capsule model of programmability, (2) the accessibility of that model, and (3) the applications that can be constructed with it. The central component of ANTS is the capsule, which users send through the network like a packet, but which also carries (a reference to) code that is executed at intermediate programmable routers called active nodes. Each capsule type implements a custom forwarding routine and so directs itself, and subsequent packets of the same type, through the network. The reference implementation of ANTS is written in Java. Any user can develop an application with ANTS, which provides an API for querying node state, routing via shortest paths, and placing data in a temporary soft-store. Once the code is written, it is signed by a trusted authority, placed in a directory service, and registered with a local active node for distribution.
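
To make the capsule model concrete, here is a minimal sketch of the idea in Python (the real ANTS API is in Java and is richer than this; the class and method names below are mine): a capsule bundles its data with a forwarding routine that each active node evaluates, with access to the node's default routes and soft-store.

```python
# A minimal sketch of the capsule idea (hypothetical names; the real ANTS API is in Java).
class ActiveNode:
    def __init__(self, node_id, routes):
        self.node_id = node_id
        self.routes = routes      # dest -> next hop, the node's default shortest-path routes
        self.soft_store = {}      # temporary per-node cache available to capsules

    def default_next_hop(self, dest):
        return self.routes[dest]

class PingCapsule:
    """A capsule bundles its data with the forwarding routine run at each active node."""
    def __init__(self, src, dest, payload):
        self.src, self.dest, self.payload = src, dest, payload

    def evaluate(self, node):
        # Custom forwarding logic: note our passage in the node's soft-store,
        # then fall back to the node's default shortest-path route.
        node.soft_store.setdefault("seen", []).append((self.src, self.dest))
        if node.node_id == self.dest:
            return None           # deliver locally
        return node.default_next_hop(self.dest)
```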

The paper then considers the three areas of the active network vision in turn. (1) With respect to capsules, the authors argue that it is feasible to carry capsule code by reference and load it on demand, and that the intrinsic processing overhead is low; however, since node capabilities are heterogeneous, it is important that not all nodes be slowed by capsule processing, resulting in only a partially active network. Even so, forwarding performance with capsules is poor. (2) With respect to who can use active networks to introduce new code, security is obviously a major concern. The authors demonstrate that they were able to isolate code and state in ANTS in a way that is comparable to or better than the static protocols used today, but they handle the problem of global resource allocation with a certification mechanism, which has several drawbacks. (3) Lastly, the services that ANTS can be used to introduce tend to be ones that are typically difficult to deploy, such as multicast, anycast, and explicit congestion notification. ANTS is most compelling for rapid deployment and experimentation, but the authors have yet to find a "killer app" that necessitates its use. To be implemented with ANTS, a service must be expressible using a restricted API, and it must be compact, fast, and incrementally deployable. The authors offer other observations from their experience, touching on several points that all protocols must deal with in some way: ANTS can help enable systematic changes to protocols in the network, it handles heterogeneous node capabilities cleanly, it does not generally violate the end-to-end argument (contrary to common concerns), and changes to network services must be localized in order to be easily deployable.

Critique

One thing I liked about ANTS is that it at least discusses and attempts to address the problem of changing current Internet protocols and introducing new ones. It is sad how much networking research never actually gets put into use simply because of the difficulty of making changes or additions to the existing architecture. However, while it's a nice idea, the performance and security issues are too big to imagine ANTS ever being used in practice. One thing that annoyed me about this paper is that several times the author tried to write off the poor performance of ANTS by blaming it on Java, while at the same time crediting many of the security benefits to the use of Java. It seemed like the author was trying to have it both ways: claiming some performance problems could be overcome by not using Java, but failing to mention that abandoning Java or something like it might also introduce greater security problems. It was for the best that the paper was written from the perspective of taking a closer look at some of the criticisms of the original active network vision, rather than arguing that ANTS should actually be used in the real world, since the performance and potentially the security problems are too severe to suggest otherwise. I approve of the author framing the motivation and contribution in the way he did.

Resilient Overlay Networks


D. Andersen, H. Balakrishnan, F. Kaashoek, R. Morris, "Resilient Overlay Networks," 18th Symposium on Operating Systems Principles, (December 2001).


One line summary: This paper describes a Resilient Overlay Network architecture which is intended to run at the application layer over the existing Internet infrastructure to provide path failure detection and recovery as well as mechanisms for applications to choose their own routing metrics and implement expressive routing policies.

Summary

This paper describes Resilient Overlay Network (RON), an application-layer overlay that exists on top of the existing Internet. The goal of RON is to allow Internet applications to detect and recover from path failures and performance degradation quickly. That is, a RON seeks to (1) provide failure detection and recovery within 20 seconds, (2) integrate routing more tightly with applications by letting applications choose their own routing metrics, and (3) allow expressive routing policies relating to acceptable use policies and rate controls.

A RON consists of a set of nodes deployed at various locations on the Internet, and each path between two RON nodes is represented as a single virtual link. An application that communicates with a RON node is a RON client and interacts with the node through an API. Each client instantiates a RON router that implements a routing protocol and a RON membership manager that keeps track of the list of RON members. By default, RON maintains information about three metrics for each virtual link: latency, loss rate, and throughput. It obtains this information via probes and shares it with all RON nodes by storing it in a central performance database. It also uses these probes to detect path outages. RON feeds the data collected from these probes and stored in the performance database into a link-state routing protocol to build routing tables according to each metric. When it detects a path outage, RON uses single-hop indirection to route over a new path. RON provides policy routing by allowing administrators to define the types of traffic allowed on certain links, and it implements this policy routing using packet classification methods. RON manages membership via either a static mechanism that reads the list of members from a file or a dynamic mechanism that uses announcement flooding and stores the discovered members in soft state at each node. One important note about RONs is that they are intended to consist of only a small number of nodes (roughly 50 or fewer) because many of the mechanisms used will not scale, but RON can still provide the intended benefits with only a small number of nodes.
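
Since single-hop indirection is the core recovery mechanism, here is a small sketch of the idea (my own simplification in Python, not RON's code): given probe-derived metrics for each virtual link, pick either the direct link or the best one-intermediate detour.

```python
# A sketch (mine, not RON's code) of single-hop indirection: choose the direct virtual
# link or the best one-intermediate detour according to a probed metric, latency here.
def best_path(src, dst, nodes, latency):
    """latency[(a, b)] is the probed latency of virtual link a->b, or None if it is down."""
    candidates = []
    direct = latency.get((src, dst))
    if direct is not None:
        candidates.append(([src, dst], direct))
    for hop in nodes:
        if hop in (src, dst):
            continue
        first, second = latency.get((src, hop)), latency.get((hop, dst))
        if first is not None and second is not None:
            candidates.append(([src, hop, dst], first + second))
    return min(candidates, key=lambda c: c[1]) if candidates else None

# e.g. best_path("A", "C", ["A", "B", "C"], {("A", "C"): None, ("A", "B"): 30, ("B", "C"): 40})
# returns (["A", "B", "C"], 70) when the direct A->C link is down.
```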

The authors evaluated RON by deploying two wide-area RONs, one with 12 nodes (RON1) and one with 16 nodes (RON2). They found that RON1 recovered from path outages 100% of the time and RON2 60% of the time. (The lower figure for RON2 is in part due to a site becoming completely unreachable.) RON is able to detect and share information about a virtual link failure in an average of 26 seconds, and it does this with an overhead of about 30 Kbps of active probing. RON was also able to route around flooded links and improve performance in a small number of cases. The authors were also able to show that RON routes are relatively stable, with the median length of time for a route to persist being about 70 seconds.

Critique

RONs are a somewhat interesting idea, and I do think the problem they address – that of allowing applications to recover from path failures, since BGP is very slow at doing so, and of allowing applications to choose their own routing metrics – is compelling. However, there are several big issues with RONs, the two most obvious being poor performance and inability to scale. With respect to the first, the IP forwarder that the authors implemented could only forward packets at a maximum rate of 90 Mbps. Also, RON was only able to select paths with better performance with respect to the default metrics a very small fraction of the time. RON also introduced a fair amount of overhead due to its use of flooding, which also relates to its inability to scale. However, I still think many of the ideas behind RONs are useful.

Tuesday, October 27, 2009

On Inferring Autonomous System Relationships in the Internet


L. Gao, "On Inferring Autonomous System Relationships in the Internet," IEEE/ACM Transactions on Networks, V. 9, N. 6, (December 2001), pp. 733-745.


One line summary: This paper provides heuristics for classifying relationships between autonomous systems in the Internet into provider-customer, peering, and sibling relationships, and evaluates the resulting inferences using AT&T internal data as well as the WHOIS lookup service.

Summary

This paper proposes a method for classifying relationships between Autonomous Systems (ASs) into provider-customer, peering, and sibling relationships. These relationships reflect the routing and commercial contractual arrangements between administrative entities. The classification is tested using BGP routing table data collected via RouteViews and verified using AT&T internal information about its ASs and their relationships with neighboring ASs, as well as the WHOIS lookup service. The paper first reviews BGP routing and policies. BGP-speaking routers in an AS enforce import policies to determine which paths to include and use in their routing tables, and export policies to determine which paths to advertise to neighbors. The relationships between an AS and its neighbors determine what export policies the AS enforces. In general, ASs follow the selective export rule, which states that an AS will not export its provider or peer routes to its providers or peers, but will export customer, provider, and peer routes to its siblings and customers. This rule leads to the property that the AS path of a BGP routing table entry will be valley-free. In a valley-free AS path there is an uphill portion that consists only of customer-to-provider or sibling-to-sibling edges and a downhill portion that consists only of provider-to-customer or sibling-to-sibling edges, and the uphill and downhill portions may be connected by a single peer-to-peer edge. From these definitions it is possible to define maximal uphill and maximal downhill paths. In addition, the uphill top provider is defined as the AS with the highest degree, or largest size, among all the ASs in the maximal uphill path, and the downhill top provider is defined analogously. The overall top provider is the AS with the highest degree among the uphill and downhill top providers.
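
As a concrete illustration of the valley-free property, here is a small check in Python (the edge labels and function are mine, just to make the definition explicit):

```python
# A small check of the valley-free property, assuming each consecutive AS pair on the
# path is labeled "c2p" (customer-to-provider), "p2c", "p2p" (peer), or "s2s" (sibling).
def is_valley_free(edge_labels):
    seen_peer = seen_downhill = False
    for label in edge_labels:
        if label == "c2p":
            if seen_peer or seen_downhill:
                return False      # climbing back uphill after the top is a valley
        elif label == "p2p":
            if seen_peer or seen_downhill:
                return False      # at most one peer edge, only at the top of the path
            seen_peer = True
        elif label == "p2c":
            seen_downhill = True
        # "s2s" edges are allowed anywhere in the uphill or downhill portion
    return True

# is_valley_free(["c2p", "p2p", "p2c"]) -> True
# is_valley_free(["p2c", "c2p"])        -> False (a valley)
```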


Given these considerations, the basic heuristic used to classify edges between ASs is as follows: it looks through the AS path of each BGP routing table entry and takes the highest-degree AS on the path as the top provider. The consecutive AS pairs before the top provider are then either customer-to-provider links or sibling-to-sibling links, and those after the top provider are either provider-to-customer links or sibling-to-sibling links. A refined version of the heuristic attempts to reduce the possibility of incorrect inference from routing misconfigurations by only concluding that a transit relationship exists if a certain number of routing table entries imply it. The final version of the algorithm attempts to identify peering relationships by identifying potential peers based on similarity of size (i.e., it assumes that the degrees of two ASs in a peering relationship do not differ by more than a certain factor R, which must be fine-tuned) and concludes that two ASs have a peering relationship if neither of the two provides transit for the other.
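
Here is a rough Python sketch of that basic pass, under my own naming and with the refinements omitted; it only conveys the shape of the degree-based inference, not the paper's exact algorithm:

```python
# A rough sketch in the spirit of the paper's basic algorithm (names and structure are
# mine; the refinements for misconfigurations and peering are omitted). transit[(a, b)]
# records that some routing table entry implies b provides transit for a.
from collections import defaultdict

def infer_relationships(as_paths, degree):
    transit = defaultdict(bool)
    for path in as_paths:
        top = max(range(len(path)), key=lambda i: degree[path[i]])  # top provider index
        for i in range(len(path) - 1):
            a, b = path[i], path[i + 1]
            if i < top:
                transit[(a, b)] = True   # uphill edge: b acts as a's provider (or sibling)
            else:
                transit[(b, a)] = True   # downhill edge: a acts as b's provider (or sibling)

    relationships = {}
    for a, b in list(transit):
        pair = frozenset((a, b))
        if transit[(a, b)] and transit[(b, a)]:
            relationships[pair] = "sibling"
        elif transit[(a, b)]:
            relationships[pair] = (b, "provider of", a)
    return relationships
```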


The author then compares the inferences of the algorithms with AT&T internal information. The algorithm correctly infers customer relationships most of the time and peering relationships a large part of the time, but tends to incorrectly infer sibling relationships. The author gives several reasons for the incorrectly inferred sibling relationships, including router misconfigurations, unusual AS relationships, and heuristic inaccuracy.


Critique

I didn't like this paper as much as the last one. I thought the author did a nice job of laying out the reasoning behind inferring the AS relationships, but it could have been explained in fewer words; I also thought too much space was spent reviewing BGP routing and policies. The heuristics didn't do very well and they seemed pretty basic, so work that improves upon these methods would be interesting to see. I question the heuristics' reliance on the degree (as a proxy for size) of an AS as a way of classifying relationships, for instance in determining peering relationships. Relatedly, as the author acknowledges, the need for a parameter like R to decide how different in size two ASs must be before they are disqualified from consideration as peers is unfortunate. I am curious as to why the heuristic did so poorly in classifying sibling relationships but not the others; although reasons were given, they weren't entirely satisfying and could certainly be explored in more depth.


On Power-Law Relationships of the Internet Topology


M. Faloutsos, P. Faloutsos, C. Faloutsos, "On Power-Law Relationships of the Internet Topology," ACM SIGCOMM Conference, (September 1999).


One line summary: This paper provides three power laws that describe fundamental properties of the Internet topology, as well as several novel metrics and methods for estimating them, all of which are useful for developing and analyzing new Internet protocols and creating artificial models of the Internet for simulation purposes.

Summary

In this paper, the authors discover three power laws that describe various graph properties of the Internet topology. According to the authors, these power laws are useful in three respects: (1) they aid in the design of efficient network protocols that can take advantage of the topological properties of the Internet the laws capture, (2) they can help researchers create more accurate artificial models of the Internet for simulation purposes, and (3) they allow researchers to derive estimates of various topological parameters that are useful for protocol analysis and for speculation about the future Internet topology. Power laws are expressions of the form y ∝ x^a, read "y is proportional to x to the power a," where a is a constant and x and y are measures of interest. The important observation of the authors' work is that each graph instance obeys such a relationship with some exponent, although the exponents can change over time.

To derive the power laws, the authors used three Internet instances collected at six-month intervals over a period of one year, 1998. These graphs are at the inter-domain level, where each node is a domain or an AS. They also use another graph from 1995 that is at the router level. The authors note that previously used metrics tend to be based on average, minimum, and maximum values, which are not good descriptors of skewed distributions such as those found in studying Internet topology. They also note that it would be useful to be able to characterize a certain aspect of a graph or plot using just one number, to facilitate comparisons. With this in mind, the authors propose several graph metrics for use in defining their power laws: f(d) is the frequency of outdegree d, i.e. the number of nodes with outdegree d; r is the rank of a node, i.e. its index in the sequence of all nodes ordered by decreasing outdegree; P(h) is the total number of pairs of nodes within h or fewer hops, including self-pairs and counting all other pairs twice; and NN(h) is the average number of nodes in a neighborhood of h hops. They also define λ as an eigenvalue of the graph and i as the order of that eigenvalue.


Given this, we can summarize the authors’ power laws. The first power law (rank exponent) states that the outdegree d of a node is proportional to its rank r to the power of a constant. The authors provide a method for estimating this constant. The difference in this constant between the interdomain graphs and the router graph suggests that it can distinguish between graphs of different types. Intuitively, the authors state that this law captures the trade-off between the gain and cost of adding an edge from a financial and functional point of view. The second power law (outdegree exponent) states that the frequency f of an outdegree d is proportional to the outdegree d to the power of a constant. The value of this constant is practically the same across all graphs which suggests that this law describes a fundamental property of the Internet topology, and that it could be useful for testing the realism of a graph intended to represent Internet topology. This power law indicates that the distribution of outdegree is not arbitrary and that lower degrees are more frequent. The third power law (eigen exponent) states that the eigenvalues of a graph are proportional to their order to the power of a constant. Again, as with the first law, this constant is nearly the same for the interdomain graphs but different for the router graph, suggesting that it could be useful for distinguishing among different types of graphs.
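
All three laws have the same shape, so the exponents are estimated the same way: linear regression on a log-log plot, with the slope giving the exponent. Here is a quick sketch of that fit in Python (illustrative only; the function and example are mine, not the paper's code or data):

```python
# A quick sketch of how such an exponent can be estimated: least-squares fit of a line
# to the (log x, log y) points; the slope is the power-law exponent.
import math

def fit_power_law(xs, ys):
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum((a - mx) ** 2 for a in lx)
    intercept = my - slope * mx
    return slope, intercept       # so y is approximately exp(intercept) * x ** slope

# e.g. for the rank exponent: rank_exponent, _ = fit_power_law(ranks, outdegrees)
```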


Lastly, the authors also describe their novel way to characterize graph connectivity, stated as an approximation (hop-plot exponent): the total number of pairs of nodes P(h) within h hops is proportional to the number of hops h to the power of a constant. This constant also allows for distinguishing among different families of graphs using a single number, and the authors give several examples. They also show how this approximation is useful for estimating how many hops are required to reach a sufficiently large part of the network. In conjunction with this approximation, the authors also provide definitions for the effective diameter of a graph and for the average neighborhood size. The effective diameter is the distance that any two nodes are from each other with high probability. The average neighborhood size is a common parameter used in analyzing the performance of network protocols, and the authors demonstrate the superiority of their estimation method. Finally, the authors conclude by giving several examples of how these power laws could be useful and by arguing that the power laws must characterize fundamental properties of the Internet due to the statistical unlikelihood of their several graphs obeying these laws simply by chance.

Critique

It was a nice change of pace to read a paper with a more theoretical bent. I mostly liked this paper. It would be interesting to derive the exponents for the power laws using graphs of today's Internet and see how they compare to those in this paper, or whether the power laws even still hold now, especially considering the trend of content providers directly connecting to customers and circumventing the large backbone providers, as we discussed in class. One critique of this paper is that the authors should have used more data. Another thing I would have liked to see is more visual examples of graphs with the kinds of properties the paper highlights via the power laws.

Sunday, October 18, 2009

White Space Networking with Wi-Fi like Connectivity


P. Bahl, R. Chandra, T. Moscibroda, R. Murty, M. Welsh, "White Space Networking with Wi-Fi like Connectivity", ACM SIGCOMM Conference, (August 2009).

One line summary: This paper describes a system for UHF white space wireless networking called WhiteFi that addresses the three main challenges unique to white space networking: spatial variation, temporal variation, and spectrum fragmentation in the available white space channels.

Summary

This paper presents a system for UHF white space wireless networking called WhiteFi. White spaces are the unused portions of the UHF spectrum that have recently been opened for use by unlicensed devices, subject to the constraint that such devices not interfere with incumbent uses such as TV broadcasts and wireless microphone transmissions. As a consequence, white space networks differ in three major ways from traditional wireless networks: spatial variation, spectrum fragmentation, and temporal variation. White space networking involves spatial variation in the availability of portions of the white space spectrum because the presence of incumbents varies over the wide area as well as on a finer scale. It involves spectrum fragmentation because incumbents occupy certain UHF channels; fragmentation varies across areas, which implies the need for variable channel widths. Lastly, white space networking involves temporal variation, largely because wireless microphones can be turned on and off at any time, and white space devices must vacate a channel when they detect a wireless microphone on it.

WhiteFi is supported on the KNOWS hardware platform, which consists of a PC, a scanner, and a UHF translator. Two key features of this platform are support for variable channel widths and a primary user signal detection algorithm called Signal Inspection before Fourier Transform (SIFT). Three key components of WhiteFi that build upon this are a novel spectrum assignment algorithm, SIFT-based discovery of white space wireless access points (APs), and a chirping protocol that permits a client to indicate disconnection from a channel due to the appearance of an incumbent, without interfering with the incumbent. The spectrum assignment algorithm is adaptive and client-aware, picking a channel and a channel width that are free for all clients. It uses a spectrum map to indicate the presence of incumbents, an airtime utilization map to indicate the degree of utilization of each UHF channel, and control messages between the clients and the AP containing these maps to share the necessary information. It uses this information, along with channel probes, to compute the multichannel airtime metric (MCham), which is roughly a measure of the aggregate bandwidth of a given selection, and which it uses as the basis for channel selection. AP signals are detected by sampling bands of the UHF spectrum using a software-defined radio and performing an efficient time-domain analysis of the raw signal using SIFT. Sudden disconnections due to the appearance of an incumbent on the channel an AP-client pair is using are dealt with using a chirping protocol, in which beacons about the white spaces now available are sent over a backup channel.
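
To give a feel for the channel selection step, here is a hedged Python sketch (my own simplification, not the paper's algorithm; `airtime_metric` stands in for the MCham computation and the data structures are assumptions): among candidate (channel set, width) choices that are incumbent-free for the AP and every client, pick the one with the highest metric.

```python
# A hedged sketch of the flavor of the spectrum assignment step (mine, not the paper's
# algorithm): skip any choice with an incumbent visible to some node, then maximize
# an airtime metric over the remaining (channel set, width) candidates.
def pick_spectrum(candidates, spectrum_maps, airtime_metric):
    """candidates: iterable of (channels, width); spectrum_maps: one set of
    incumbent-occupied channels per node (the AP plus each associated client)."""
    best, best_score = None, float("-inf")
    for channels, width in candidates:
        if any(set(channels) & busy for busy in spectrum_maps):
            continue              # some node sees an incumbent on part of this choice
        score = airtime_metric(channels, width)
        if score > best_score:
            best, best_score = (channels, width), score
    return best
```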

In the evaluation of WhiteFi, several things are demonstrated. The first is that SIFT accurately detects packets over varying channel widths even with high signal attenuation, missing in the worst case at most 2%. The second is that the AP discovery algorithms are effective. The next is that WhiteFi correctly handles disconnections using its chirping protocol. The last is that WhiteFi’s channel selection algorithm adapts quickly and makes near-optimal selections to operate on the best available part of the spectrum. This last point is demonstrated via simulation.

Critique

This was one of my favorite papers that we've read so far (along with the classics). To me at least this seemed like an awesome feat of engineering, and it's obviously very useful technology, so it's exciting to think of it becoming available. I actually quite enjoyed reading it. Obviously there will be a lot of interesting work in the future that builds off of what they did here. I checked to confirm my suspicion that it won the Best Paper Award at SIGCOMM this year, so I like to think that somewhat legitimizes my raving about it.

Saturday, October 17, 2009

Interactive WiFi Connectivity For Moving Vehicles


A. Balasubramanian, R. Mahajan, A. Venkataramani, B. N. Levine, J. Zahorjan, "Interactive WiFi Connectivity For Moving Vehicles", ACM SIGCOMM Conference, (August 2008).


One line summary: This paper describes ViFi, a protocol that provides WiFi connectivity to moving vehicles by letting a vehicle use several nearby base stations simultaneously, minimizing disruptions from handoffs.

Summary

This paper introduces ViFi, a protocol for providing WiFi connectivity to moving vehicles with minimal disruptions. ViFi achieves this by allowing a vehicle to use several base stations (BSes) simultaneously so that handoffs are not hard handoffs. The paper begins by evaluating six different types of handoff policies on their VanLAN testbed at the Microsoft Redmond campus: (1) RSSI—the client associates with the base station with the highest received signal strength indicator, (2) BRR—the client associates with the base station with the highest beacon reception ratio, (3) Sticky—the client doesn't disassociate from a base station until connectivity is absent for some defined period of time, then chooses the one with the highest signal strength, (4) History—the client associates with the base station that has historically provided the best performance at that location, (5) BestBS—the client associates, in a given second, with the base station that will provide the best performance in the next second, and (6) AllBSes—the client opportunistically uses all base stations in the vicinity. The first five policies use hard handoffs. The last is an idealized method that represents an upper bound. The metrics used to judge these policies are aggregate performance and periods of uninterrupted connectivity. Sticky performs the worst, while RSSI, BRR, and History perform similarly. AllBSes is by far the best. The authors also observe gray periods in which connection quality drops sharply and unpredictably.

The authors explain that the effectiveness of AllBSes stems from the fact that the vehicle is often in range of multiple BSes and that packet losses are bursty and roughly independent across senders and receivers. They design ViFi as a practical version of AllBSes. In ViFi, there is an anchor base station and a set of auxiliary base stations. The identities of these are embedded in the beacons that the vehicle broadcasts periodically, along with the identity of the previous anchor. An auxiliary base station attempts to relay packets that have not been received by the destination. It does so with an estimated probability computed according to the following guidelines: (1) relaying decisions made by other potentially relaying auxiliaries must be accounted for, (2) auxiliaries with better connectivity to the destination should be preferred, and (3) the expected number of relayed transmissions should be limited. ViFi also allows for salvaging, in which a new anchor contacts the previous anchor for packets that were supposed to be delivered to the vehicle but could not be delivered before the vehicle moved out of range. ViFi uses broadcasts and implements its own retransmissions. It does not implement exponential backoff because it assumes collisions will not be a problem. To evaluate ViFi the authors deploy it on VanLAN and do a trace-driven simulation using measurements from another testbed called DieselNet. The most important findings according to the authors are: (1) the link-layer performance of ViFi is close to ideal, (2) ViFi improves application performance twofold over current handoff methods, (3) ViFi places little additional load on the vehicle-base station medium, and (4) ViFi's mechanism for coordinating relays has low false positive and false negative rates.
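
Here is a loose sketch of what such a relay decision might look like in Python; this is my own simplification to convey the guidelines above, not the paper's actual formula, and the inputs (delivery ratios, overhearing probabilities) are assumed to be estimated elsewhere.

```python
# A loose sketch (my own simplification, not the paper's formula) of a relay decision
# following the three guidelines: weight my delivery ratio to the destination against
# the other auxiliaries that probably overheard the packet, so that better-connected
# auxiliaries relay more often and the expected number of relays stays near one.
import random

def should_relay(my_delivery, others):
    """others: list of (prob_overheard, delivery_ratio) for the other auxiliary BSes."""
    total = my_delivery + sum(p_heard * d for p_heard, d in others)
    if total <= 0:
        return False
    p_relay = min(1.0, my_delivery / total)
    return random.random() < p_relay
```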

Critique

I didn't find this paper very exciting, but I still thought there were a lot of good things about it. The idea behind ViFi isn't all that novel, since it has been used in cellular networks. I liked that the authors first examined a variety of handoff techniques and gave results for those. One thing I am confused about is how they implemented AllBSes in order to test it, considering they argued that it is impractical to implement and represents an ideal. In general, the paper had a lot of evaluation in it, which I liked. Although ViFi performed well in the settings in which it was tested, I'm not sure how practically useful it is. For example, the authors assume that collisions are not an issue. If ViFi were to become successful and widely deployed, collisions might become a problem, and they don't even attempt to consider situations in which there might be collisions. Also, I can only imagine people wanting to connect to the Internet if they are traveling fairly long distances, but as the authors mention in the paper, WiFi deployments may not be broad enough, and lack of coverage between islands may make it unusable. There are several other deployment challenges too, and not just those mentioned in the paper. I just can't imagine it being desirable in many settings: I don't see cities wanting to invest in this, and I'm not sure you could get many other people besides a city government to freely provide access. It could be useful to deploy many ViFi base stations along a commuter rail line, though in that case, since you know the speed, location, and schedule of the moving vehicles, you could probably do even better than ViFi.

Thursday, October 15, 2009

XORs in the Air: Practical Wireless Network Coding


S. Katti, H. Rahul, W. Hu, D. Katabi, M. Medard, J. Crowcroft, "XORs in the Air: Practical Wireless Network Coding," ACM SIGCOMM Conference, (September 2006).

One line summary: This paper describes a coding mechanism for improving throughput in wireless networks called COPE; COPE relies on XORing packets together and is able to improve overall network throughput in many cases.

Summary

This paper presents COPE, a coding mechanism for improving throughput in wireless networks that sits as a shim between the IP and MAC layers. Two key design principles of COPE are that it embraces the broadcast nature of the wireless channel and that it employs network coding. COPE employs three main techniques: opportunistic listening, opportunistic coding, and learning neighbor state. Basically, the way COPE works is that all packets are broadcast. When a node overhears a broadcast packet it stores it for a short time. Each node also shares with its neighbors which packets it has overheard. When a node needs to transmit, it XORs several packets together if each packet's intended next hop has enough information to decode the XORed packet, which is the case if that next hop has already overheard the other n-1 of the n packets XORed together.
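
The coding rule is simple enough to sketch directly; here is a small Python illustration (my own toy version, with made-up data structures for the neighbor state) of the XOR operation and the check that every next hop will be able to decode:

```python
# A toy illustration (data structures are mine) of the XOR operation and of the check
# that every intended next hop will be able to decode the coded packet.
def xor_packets(packets):
    coded = bytearray(max(len(p) for p in packets))
    for p in packets:
        for i, byte in enumerate(p):
            coded[i] ^= byte
    return bytes(coded)

def can_code_together(packets, next_hop, overheard):
    """next_hop[p] is p's intended next hop; overheard[n] is the set of packets
    node n is believed to have overheard (learned from reception reports)."""
    for p in packets:
        others = [q for q in packets if q is not p]
        if not all(q in overheard[next_hop[p]] for q in others):
            return False          # that next hop could not decode its own packet
    return True

# A recipient decodes by XORing the coded packet with the n-1 packets it already has.
```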

The packet coding algorithm of COPE embodies several design decisions: (1) COPE never delays packets, (2) COPE gives preference to XORing packets of the same length, (3) COPE maintains two virtual queues (one for big packets and one for small packets) per neighbor to aid in making coding decisions, (4) COPE makes sure packets do not get reordered, and (5) each node tries to ensure that each neighbor to whom an XORed packet is headed has a high probability of decoding the packets within it. COPE uses pseudo-broadcast because real broadcast lacks reliability and backoff upon collision. It also uses hop-by-hop, asynchronous ACKs and retransmissions. The authors define two measures by which to quantify the throughput gains of COPE. The first is the coding gain, defined as the ratio of the number of transmissions required by a non-coding approach to the minimum number of transmissions required by COPE to deliver the same set of packets. The second is the coding+MAC gain, which arises from COPE's beneficial interaction with the MAC; for topologies with a single bottleneck, for example, it is defined as the ratio of the bottleneck's draining rate with COPE to its draining rate without COPE.

The paper gives a nice flowchart that succinctly shows the sender and receiver operation in COPE. It also describes the evaluations. The most important findings are as follows: (1) when the wireless medium is congested and the transport layer does not perform congestion control (e.g., UDP), COPE delivers a 3-4x improvement in throughput, (2) in this case COPE's throughput exceeds the expected coding gain and agrees with the coding+MAC gain, (3) when the network is connected to the Internet via a gateway, the throughput improvement from COPE depends on the amount and diversity of upload traffic and ranges from 5% to 70%, and finally (4) hidden terminals in the network create a high loss rate, which causes TCP to underutilize the medium and not create enough coding opportunities, so there is no improvement with COPE; when there are no hidden terminals and TCP is used as the transport protocol, the throughput of COPE agrees with the expected coding gain.

Critique

This paper struck me as one of those ideas that seem obvious after you read it. That is usually a good thing, at least in my mind. One thing I thought was interesting is the choice to use pseudo-broadcast instead of just implementing their own ACKs and backoff, as has been done in other papers. One bad thing about COPE is its need to order packets at the destination. The authors state this is necessary because COPE reorders packets and that can hurt the performance of TCP, but one result of this modification is that other causes of reordering are masked; this is probably not important, though. It's disappointing that COPE doesn't really provide much improvement when used with TCP. One thing I really wondered about is why, in the evaluation of COPE in a network connected to the Internet via one or more gateways, the throughput improvement depended on the amount of uplink traffic. I'm not sure I understand why this is the case and I thought they could have explained it a little more fully; perhaps when there is a higher uplink-to-downlink ratio there are more coding opportunities. If it is true that more uplink traffic results in improved throughput, then in real-world situations the throughput gains are likely to be disappointing, because at least one other paper we read for this class reports that in general there is less uplink traffic to the Internet than downlink traffic from it. One other thing is that for their metrics they used aggregate measures such as total network throughput, and it might have been nice to see average per-flow throughput as well.

ExOR: Opportunistic Multi-Hop Routing for Wireless Networks


S. Biswas, R. Morris, "ExOR: Opportunistic Multi-Hop Routing for Wireless Networks," ACM SIGCOMM Conference, (August 2005).

One line summary: This paper describes a wireless routing protocol called ExOR that is inspired by cooperative diversity schemes and that achieves significant improvement in throughput over traditional routing protocols.

Summary

This paper describes a wireless routing and MAC protocol called ExOR. In contrast to traditional routing protocols, which choose a sequence of nodes along a path and then forward packets through each node on that path in order, ExOR is inspired by cooperative diversity routing schemes: it broadcasts each packet to a list of candidate forwarders, and the best node to receive the packet forwards it in turn. The authors demonstrate that ExOR achieves a substantial increase in throughput over traditional routing protocols. One reason for this is that each transmission in ExOR has more chances of being received and forwarded. Another is that it takes advantage of transmissions that travel unexpectedly far or that fall short. Four key design challenges for ExOR were (1) nodes must agree on which subset of them received each packet, (2) the node closest to the destination that receives a packet should be the one to forward it, (3) there is a penalty to using too many nodes as forwarders since the costs of agreement go up, and (4) simultaneous transmissions must be avoided to prevent collisions.

Briefly, the way ExOR works is as follows. The source batches packets destined for the same host and broadcasts the batch. Each packet contains a batch map that indicates, for each packet in the batch, the highest-priority node known to have received it. Each packet also contains a forwarder list of nodes, ordered by increasing cost of delivering a packet from that node to the destination; the cost metric used is similar to ETX. A node that receives a packet checks the forwarder list for itself. It also checks the batch map in the packet and uses it to update its local batch map whenever the packet's map indicates a higher-priority node has received a packet. Nodes forward all remaining packets that have not yet been received by a higher-priority node, and they do this in the order in which they appear in the forwarder list. Nodes estimate the time they need to wait before sending, or use a default value if they have no basis for a guess. Once the last node in the forwarder list has transmitted, the source starts the process again by broadcasting all packets not yet received by any node. The destination sends copies of its batch map to propagate information back to the sender about which packets were received. Nodes continue in this mode only until 90% of the batch has been received; the rest is then forwarded using traditional routing.
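
Here is a small Python sketch of that per-node decision (a simplification under my own naming; timing, transmission slots, and the 90% cutoff are ignored):

```python
# A simplified sketch (my naming) of what each node forwards when its turn in the
# forwarder list arrives.
def packets_to_forward(my_rank, batch, local_batch_map):
    """batch: pkt_id -> packet payload. local_batch_map: pkt_id -> rank of the
    highest-priority node known to hold that packet (lower rank = closer to the
    destination), or None if no node is known to have received it."""
    to_send = []
    for pkt_id, pkt in batch.items():
        best_holder = local_batch_map.get(pkt_id)
        if best_holder is None or best_holder > my_rank:
            to_send.append(pkt)   # no higher-priority node has it yet, so send it
    return to_send
```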

ExOR was evaluated on Roofnet. The authors measure the throughput obtained when transmitting 1MB using traditional routing and ExOR between each of 65 pairs of nodes. They compare the 25 highest and 25 lowest throughput pairs (with respect to traditional routing). ExOR outperforms traditional routing even over just one hop. They also show that ExOR had to retransmit packets on average half as many times as the traditional routing protocol did. They examine the performance of ExOR using various batch sizes and conclude the optimal size is likely between 10 and 100 packets. They use a simulator to study the effects of shared interference on the performance of ExOR and conclude that it only slightly hurts ExOR’s performance. Lastly they show that the throughput of ExOR exhibits less variance than that of traditional routing.

Critique

I liked the ExOR protocol because I thought it was unusual and clever and it did get much better throughput than traditional routing. I thought it was especially interesting that it got better throughput even over one hop. I also liked how in the evaluation section, the authors broke their results down to the 25 highest and lowest throughput pairs because it was informative and a nice way to think about the results.

All that said, here are some things I didn't like about this paper. The authors assume that reception at different nodes is independent and that there is a gradual falloff in delivery probability with distance, and the performance of ExOR depends in part on these assumptions, but it seems they could be examined more closely. I thought their simulation might have been oversimplified, to the extent that I'm not convinced it really gave much useful information. Another thing is that it seems like the nodes have to maintain a lot of state (delivery probabilities for all pairs, unless I'm misunderstanding something). Also, as in some other papers from these authors, they ran their evaluations while other traffic was running over their Roofnet testbed. This seems bad to me, because I thought scientists were supposed to control as many variables as possible so that their experiments are repeatable. Another issue with the evaluation is that they had the traditional routing protocol send the entire file to the next hop before the next node starts sending. They claim this is more fair, and that may indeed be the case, but since I don't think this is what traditional routing normally does, it seems like they should have tried it both ways to verify that the modified approach really does give traditional routing better performance.

Overall though I liked this paper and I think it should stay in the syllabus.

Thursday, October 8, 2009

A Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols


J. Broch, D. Maltz, D. Johnson, Y-C Hu, J. Jetcheva, "A Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols," ACM Mobicom Conference, (October 1998).

One line summary: This paper uses a network simulator that the authors improved by adding relatively realistic physical and spatial models to compare four wireless ad hoc network routing protocols that cover a wide range of design choices: DSDV, TORA, DSR, and AODV.

Summary


This paper compares four wireless ad hoc network routing protocols using detailed simulations. The four protocols compared were (1) Destination-Sequenced Distance Vector using sequence number triggered updates (DSDV-SQ), (2) Temporally Ordered Routing Algorithm (TORA), (3) Dynamic Source Routing (DSR), and (4) Ad Hoc On-Demand Distance Vector using link layer feedback to detect link breakage (AODV-LL). DSDV-SQ is a hop-by-hop distance vector routing protocol that uses periodic broadcast updates and guarantees loop freedom. TORA is a distributed routing protocol that uses a link-reversal algorithm to provide on demand route discovery and runs on top of IMEP. DSR uses on-demand source routing and consists of Route Discovery and Route Maintenance phases. Lastly, AODV-LL is like a combination of DSR and DSDV because it has on-demand Route Discovery and Route Maintenance along with hop-by-hop routing and sequence numbers. These protocols were simulated using ns-2 at varying levels of node mobility and number of senders. The simulator was enhanced to allow for realistic physical layer and propagation modeling as well as node mobility. It used a random waypoint model to simulate mobility and constant bit rate (CBR) sources. The metrics by which the protocols were judged were (1) packet delivery ratio, (2) routing overhead, and (3) path optimality.

The results are presented for each metric. These results are for hosts that move at a speed of 20 m/s when not paused. In terms of the percentage of packets delivered, DSR and AODV-LL deliver between 95% and 100% of packets regardless of offered load and node mobility. DSDV-SQ drops to 70% packet delivery with constant node mobility, which the authors attribute to packets dropped on account of stale routing table entries. TORA does well until the number of sources sending packets reaches 30, at which point it experiences congestive collapse due to a positive feedback loop. In terms of routing overhead, DSR and AODV-LL have similar curves, although AODV-LL has higher overhead when node mobility is near constant. The authors later note that if overhead is measured in bytes instead of packets, DSR's overhead becomes much greater than AODV-LL's. The overhead of TORA depends in part on node mobility and is much higher than that of any of the other protocols. DSDV-SQ has nearly constant overhead independent of node mobility or offered load. In terms of path optimality, both DSDV-SQ and DSR use near-optimal routes regardless of node mobility, whereas TORA and AODV-LL use less optimal routes when node mobility is high. Other interesting observations the authors make are that the percentage of packets successfully delivered is lower for broadcast packets than for unicast packets, and that early in their experiments they found a serious integration problem with ARP; they found a workaround but note that this problem would have to be addressed in any real implementation running on top of ARP. The authors don't conclusively rank the protocols in the end, but it is clear from the experiments that DSR is probably the best protocol, followed by AODV-LL and DSDV-SQ, with the relative ranking of these two being less clear due to tradeoffs. TORA is clearly the worst protocol in almost every respect.

Critique

Although I tend to think that real-world experiments are always preferable to simulations, I like that the authors improved the network simulator to include a spatial model to simulate host mobility and a relatively realistic physical layer and radio network interface model. I also appreciated the thoroughness of their simulations with respect to their clearly stated metrics. Their section describing their experimental results was considerably easier to follow than most papers and the way they laid out their graphs made it pretty easy to compare the results from the different protocols. I also like that they provided additional, more in-depth explanations for certain observations where they were warranted, for example, their explanation of the congestive collapse of TORA. Their section containing additional observations was nice too. For some reason I am having trouble questioning their assumptions and making criticisms of this paper (nothing immediately jumps out at me), but because they used a simulator and also implemented all the protocols themselves there are clearly going to be a lot of assumptions that underlie their results. I thought they did a pretty good job of clearly stating what all these assumptions were though, so at least they are there for readers to take into account. I liked this paper and I think it’s probably good to keep it in the syllabus.

A High-Throughput Path Metric for Multi-Hop Wireless Routing


D. De Couto, D. Aguayo, J. Bicket, R. Morris, "A High Throughput Path Metric for Multi-Hop Wireless Routing," ACM Mobicom Conference, (September 2003).


One line summary: This paper presents ETX, a new routing metric for wireless networks that attempts to maximize path throughput; the paper also explains why min hop count is a poor metric for use in wireless networks and evaluates min hop count against ETX.

Summary

This paper presents a new metric for routing in wireless networks called the expected transmission count (ETX). It compares ETX with the most commonly used metric, minimum hop count, which is implicitly based on the assumption that links either work well or don't work at all. The authors give several reasons why, as a result, routing protocols that use min hop count as their metric for selecting paths achieve poor performance in wireless networks, and they quantify these reasons by evaluating the min hop count metric in a testbed. One reason min hop count performs poorly is that by minimizing the number of hops, it maximizes the distance traveled in each hop, which tends to minimize signal strength and maximize the loss ratio. Min hop count performs well when the shortest path is also the fastest path, but that is frequently not the case in wireless networks. When there are a number of paths with the same minimum hop count, routing protocols often choose one at random, and this is unlikely to be the best choice. Another issue is that min hop count does not deal well with asymmetric links, which are common in wireless networks. Lastly, min hop count does not take into account the wide range of loss ratios found on wireless links.

Given these considerations, the authors state that ETX must account for the wide range of link loss ratios, the existence of asymmetric links, and the interference between successive hops of a multi-hop path. ETX is designed to maximize throughput. The ETX of a path is the sum of the ETX of each link in the path. The ETX of a link is defined as one over the probability that a transmission over that link is successfully received and acknowledged; this probability is calculated from the delivery ratios measured in both the forward and reverse directions over the link, which are in turn measured using probe packets. Five important characteristics of ETX according to the authors are that (1) it is based on delivery ratios, (2) it detects and handles asymmetry, (3) it uses precise link loss ratio measurements, (4) it penalizes routes with more hops, and (5) it minimizes spectrum use. Some drawbacks of ETX are that it only makes sense for networks with link-layer retransmissions, it assumes radios have a fixed transmit power, it is susceptible to problems due to MAC unfairness under high load, it may not choose the highest-throughput path when that path has more than three hops, and it does not account for mobility.
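
As I read the definitions, the metric boils down to a couple of lines; here is a small sketch in Python (the helper names and example numbers are mine):

```python
# A small sketch of the metric: with forward delivery ratio df and reverse ratio dr,
# the expected number of transmissions on a link is 1 / (df * dr), and a path's ETX
# is the sum over its links.
def link_etx(df, dr):
    if df <= 0 or dr <= 0:
        return float("inf")       # the link is unusable in at least one direction
    return 1.0 / (df * dr)

def path_etx(links):
    """links: iterable of (df, dr) delivery-ratio pairs along the path."""
    return sum(link_etx(df, dr) for df, dr in links)

# e.g. a two-hop path with 90%/90% and 70%/80% links:
# path_etx([(0.9, 0.9), (0.7, 0.8)])  ->  about 1.23 + 1.79 = 3.02 expected transmissions
```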

ETX was evaluated by modifying two routing protocols, DSDV and DSR, to use it as the routing metric and comparing them against the same protocols using min hop count. Some conclusions the authors draw with respect to DSDV are that ETX performs better than min hop count, especially when min hop count uses paths with asymmetric links; that ETX incurs more overhead than min hop count; that ETX overestimates the delivery ratio for large data packets and underestimates it for small ACKs; that ETX outperforms DSDV with a handshaking scheme; and that ETX's performance improves when a certain delay-use modification is applied. With respect to DSR, they found that link-failure feedback allows DSR with min hop count to perform as well as DSR with ETX.

Critique

I didn't like this paper as much as I liked the Roofnet paper. ETX didn't turn out to be as impressive in the evaluation section as the authors made it sound in the initial sections, so that was kind of disappointing. In the Roofnet paper, they don't actually use ETX directly, but a more sophisticated metric called ETT. I thought the authors' explanations at the beginning of the paper for why min hop count performs poorly in wireless networks were nice; I wasn't as impressed with the rest of it. I thought it was interesting that DSR with min hop count and link failure notification performed pretty much as well as DSR with ETX, and I wonder how ETX would compare with tougher competitors beyond the naive min hop count metric. ETX clearly has some unsatisfactory features, including the overhead of using it with DSDV and its tendency to misestimate the delivery ratio for packets that are much smaller or much larger than the probe packets. Also, some other things about the evaluation section confused me. For instance, their explanation of why you shouldn't compare runs was confusing: it seems that they used entirely different parameters (packet size, transmit power) in different runs according to the labels on the graphs, but then they say you shouldn't compare the runs because network conditions change over time, not because the parameters are different. Even though I am not very enthusiastic about this paper and ETX didn't turn out to be as great as the authors imply at the beginning, there are probably still a lot of interesting things to learn from it, things to do as well as things not to do. The authors clearly learned from this experience, because they created an improved metric, ETT, for Roofnet.

Sunday, October 4, 2009

Architecture and Evaluation of an Unplanned 802.11b Mesh Network


J. Bicket, D. Aguayo, S. Biswas, R. Morris, "Architecture and Evaluation of an Unplanned 802.11b Mesh Network," ACM Mobicom Conference, (September 2005).

One line summary: This paper describes Roofnet, an unplanned wireless mesh network built with 37 nodes over an area of about 4 square km; this paper presents its design and implementation as well as an evaluation of the actual network constructed.

Summary

This paper discusses the design and evaluation of an unplanned community wireless mesh network called Roofnet. Community wireless networks are usually constructed in one of two ways: by constructing a planned multi-hop network with nodes in chosen locations and directional antennas or by operating hot-spot access points to which clients directly connect. Roofnet aims to combine the advantages of both approaches by employing the following design decisions: (1) unconstrained node placement, (2) omni-directional antennas, (3) multi-hop routing, and (4) optimization for routing in a slowly changing environment with many links of varying quality.

Roofnet consists of 37 nodes spread across about four square km. Node locations are neither random nor truly planned. Roofnet provides Internet access. The nodes are self-configuring. Roofnet uses its own set of IP addresses, which are meaningful only within the mesh. Each node runs a DHCP server and provides NAT to the hosts connected to it. If a Roofnet node can connect directly to the Internet, it acts as a gateway for the rest of Roofnet. At the time the paper was written, Roofnet had four gateways. Roofnet uses its own routing protocol, called Srcr, which uses source routing and attempts to find the highest-throughput routes using Dijkstra's algorithm. The routing metric used in lieu of exact information about route throughput is the estimated transmission time (ETT), a prediction of the time it would take a packet to traverse a route given each link's transmit bit rate and the delivery probability at that bit rate. Nodes choose among the available 802.11b transmit bit rates using an algorithm called SampleRate, which attempts to send packets at the bit rate that will provide the most throughput.
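
Here is a crude Python sketch of the ETT idea as I understand it from the paper's description (my own simplification; the helper names are mine, and the real estimator, driven by SampleRate's per-bit-rate loss measurements, is more involved):

```python
# A crude sketch of the ETT idea: estimate per-link delivery time as airtime at a bit
# rate divided by the delivery probability at that rate, take the best rate per link,
# and sum along the route.
def link_ett(packet_bits, rates_and_probs):
    """rates_and_probs: iterable of (bit_rate_in_bps, delivery_probability) pairs."""
    best = float("inf")
    for rate, prob in rates_and_probs:
        if prob > 0:
            best = min(best, (packet_bits / rate) / prob)
    return best

def route_ett(packet_bits, links):
    """links: iterable, one rates_and_probs list per hop on the route."""
    return sum(link_ett(packet_bits, rates) for rates in links)
```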

Roofnet was evaluated using four sets of measurements: (1) multi-hop TCP, (2) single-hop TCP, (3) a loss matrix, and (4) multi-hop density. Some simulation was also used. A brief overview of the findings follows. Roofnet's one-hop routes have speeds consistent with the 5.5 Mbps transmission rate, but longer routes are slower; that is, throughput decreases with each hop faster than might be expected, which the authors speculate is due to collisions among concurrent transmissions. The maximum number of hops to a gateway is 5. Roofnet's routing algorithm, Srcr, prefers short, fast hops. The median delivery probability of the links used by Srcr is 80%. Roofnet approaches all-pairs connectivity with more than 20 nodes, and as the number of nodes increases, throughput increases. The majority of Roofnet nodes route through more than two neighbors for their first hop, suggesting the network makes good use of the mesh topology. The best few links in Roofnet contribute considerably to overall throughput, but dozens of nodes must be eliminated before throughput drops by half. Fast links are more important for throughput, but long, fast links are more important for connectivity. In Roofnet, if only single-hop routing is used, five gateways are needed to cover all nodes. For five or fewer gateways, randomly chosen multi-hop gateways are better than randomly chosen single-hop gateways, but for larger numbers of gateways, carefully chosen single-hop ones are better. The authors also examined one 24-hour period of use of the Roofnet network by their volunteer users, monitoring one gateway. They found that 94% of the traffic through the gateway was data and the rest was control packets. 48% of the traffic was from nodes one hop away and 36% from nodes two hops away; the rest was from nodes three or more hops away. Almost all of the packets through the gateway were TCP. Web requests made up 7% of the data transferred and BitTorrent made up 30%, although 67% of the connections were web connections and only 3% were BitTorrent.

Critique

I thought this paper was very interesting, and the results of their evaluations in particular. However, I don't think it is good that they allowed users to use Roofnet while they were doing their experiments. It seems like this would definitely have some effect, but one that is difficult to quantify and explain, especially considering they had a not insignificant amount of BitTorrent traffic on their network. Also, I think they should have explained how they did their simulations a bit more. I wasn't entirely clear on that point; for instance, in these simulations, did they still let SampleRate determine at what rate to transmit? Since SampleRate adjusts over time, I am not sure what this means for their simulations. Despite these things I still liked their evaluations. I thought they selected interesting aspects of Roofnet to examine, and their results were presented in a very clear way.

I think that the design of Roofnet itself is pretty cool. I also tend to prefer papers about things that were actually built rather than just simulated. Their routing algorithm is clever, although they point out that it may not be scalable. One thing that confused me is that in the section on addressing, they explain that each Roofnet node assigns itself an address from an unused class-A address block. They say these addresses are only meaningful within Roofnet and are not globally routable, so I wonder whether the addresses really need to come from an unused block. If they do, that’s obviously a huge constraint, and if they don’t, I’m not sure why the paper says the block is unused without explaining that it doesn’t have to be. I may be misunderstanding something very basic there, but if not, they may be leaving something out.

In summary, this paper was fun to read and I think it should stay in the syllabus.

Modeling Wireless Links for Transport Protocols


Andrei Gurtov, Sally Floyd, "Modeling Wireless Links for Transport Protocols," ACM SIGCOMM Computer Communication Review, Volume 34, Number 2, (April 2004).

One line summary: This paper discusses modeling wireless links, including problems with current models and how the assumptions of a model can affect the evaluation of transport protocols using that model.

Summary

This paper discusses modeling wireless links, especially for the purpose of evaluating transport protocols. The paper first briefly describes the three main types of wireless links: cellular, wireless LANs, and satellite links. It then discusses common topologies found with wireless links, as well as the typical kinds of traffic that run over them. It states that common performance metrics for wireless links include throughput, delay, fairness, dynamics, and goodput. The paper goes on to give four reasons why better models are needed, along with supporting examples for each reason: (1) some current models are not realistic, (2) some current models are realistic but explore only a small fraction of the parameter space, (3) some current models are overly realistic, and (4) many models are not reproducible. The paper then describes specific characteristics of wireless links: error losses and corruption, delay variation, packet reordering, on-demand resource allocation, bandwidth variation, and asymmetry in bandwidth and latency. It also discusses the effect of queue management and node mobility on transport protocols. The paper then argues that it is not necessarily the case that transport protocols must adapt to wireless links or that wireless link layer protocols must accommodate all transport protocols, but rather that the designers of each should take into account the characteristics of the other and their interplay. Finally, the paper discusses several areas where it is not yet clear how link layer and transport layer protocols should interact: bit error detection and correction, packet reordering, delay variation, and cross-communication between layers.
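
As a rough illustration of how these characteristics might be gathered into a reusable, reproducible model (one of the paper’s complaints is that many models are not reproducible), here is a minimal sketch of a parameterized link model; the class, parameter names, and default values are all invented for illustration and are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class WirelessLinkModel:
    """Toy model collecting the kinds of link characteristics the paper
    discusses. All defaults are arbitrary illustrative values."""
    bandwidth_bps: float = 384_000   # downlink bandwidth
    uplink_bps: float = 64_000       # bandwidth asymmetry
    base_delay_s: float = 0.100      # one-way base delay
    delay_jitter_s: float = 0.050    # delay variation (e.g. link-layer ARQ)
    loss_prob: float = 0.01          # error losses seen above the link layer
    reorder_prob: float = 0.001      # packet reordering
    alloc_delay_s: float = 0.500     # on-demand channel allocation delay

    def packet_delay(self, size_bytes: int, channel_idle: bool) -> float:
        """Sample a one-way delay for a packet of the given size."""
        d = self.base_delay_s + (size_bytes * 8) / self.bandwidth_bps
        d += random.uniform(0, self.delay_jitter_s)
        if channel_idle:             # channel must first be (re)allocated
            d += self.alloc_delay_s
        return d

    def is_lost(self) -> bool:
        """Decide whether this packet is dropped by the link."""
        return random.random() < self.loss_prob
```

The point of such a parameterization is simply that the assumptions behind a simulated link are written down explicitly, so an evaluation of a transport protocol over it can be reproduced and its parameter space explored.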

Critique

I didn’t really like this paper. It’s not that I disagree with anything the authors said. Many of the things they point out are well known, and they say so. It is useful that they aggregated all this information and presented it in a nicely organized and logical way, and it was somewhat informative. I guess I’m just surprised that it was published in this venue when they didn’t really implement anything, prove anything, or even present that many results. I would more expect to find this kind of material in a textbook.

Thursday, October 1, 2009

A Comparison of Mechanisms for Improving TCP Performance over Wireless Links


H. Balakrishnan, V. Padmanabhan, S. Seshan, R. H. Katz, "A Comparison of Mechanisms for Improving TCP Performance over Wireless Links," IEEE/ACM Transactions on Networking, (December 1997).


One line summary: This paper compares several end-to-end, link layer, and split connection schemes for improving the performance of TCP over wireless links, and concludes that link layer schemes incorporating TCP awareness and selective acknowledgments, as well as end-to-end schemes using Explicit Loss Notification and selective acknowledgments, were the most promising.

Summary

This paper compares a number of schemes for improving TCP performance in wireless networks using an experimental testbed consisting of a wired and a wireless link. The approaches compared fall under three categories: end-to-end, link layer, and split connection. In the end-to-end category, six specific protocols were tested: the de facto TCP implementation, TCP Reno (labeled E2E); TCP New Reno (labeled E2E-NEWRENO), which improves upon TCP Reno by remaining in fast recovery after a partial acknowledgment so it can recover from losses at the rate of one packet per RTT; two schemes using selective acknowledgments (SACKs), labeled E2E-SMART and E2E-IETF-SACK; and two schemes using Explicit Loss Notification (ELN), labeled E2E-ELN and E2E-ELN-RXMT. Four link layer schemes were tested: a base link layer algorithm, LL; a link layer algorithm that uses SACKs, labeled LL-SMART; and two additional schemes that add TCP awareness to the first two, labeled LL-TCP-AWARE and LL-SMART-TCP-AWARE. Two split connection schemes were tested: one labeled SPLIT, modeled after I-TCP, and another that adds SACKs to it, called SPLIT-SMART.
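
To give a flavor of what “TCP awareness” at the link layer means (LL-TCP-AWARE is essentially the snoop approach), here is a minimal sketch of the general idea rather than the authors’ implementation; the class, its method names, and the duplicate-ACK threshold are all assumptions made for illustration.

```python
class SnoopLikeAgent:
    """Sketch of a TCP-aware link-layer agent at the wireless base station.
    It caches unacknowledged TCP segments heading to the mobile host and
    retransmits them locally, instead of letting duplicate ACKs trigger an
    end-to-end fast retransmit at the fixed sender."""

    DUPACK_THRESHOLD = 1  # retransmit locally on the first duplicate ACK (illustrative)

    def __init__(self, wireless_send):
        self.cache = {}              # seq -> cached segment
        self.last_ack = -1
        self.dupacks = 0
        self.wireless_send = wireless_send

    def on_data_from_sender(self, seq, segment):
        """Cache the segment in case the wireless hop drops it, then forward it."""
        self.cache[seq] = segment
        self.wireless_send(segment)

    def on_ack_from_mobile(self, ack):
        """Return True if the ACK should be forwarded to the fixed sender."""
        if ack > self.last_ack:
            # New ACK: free acknowledged segments and forward it.
            for seq in [s for s in self.cache if s < ack]:
                del self.cache[seq]
            self.last_ack, self.dupacks = ack, 0
            return True
        # Duplicate ACK: a segment was probably lost on the wireless link.
        self.dupacks += 1
        if ack in self.cache and self.dupacks >= self.DUPACK_THRESHOLD:
            self.wireless_send(self.cache[ack])  # local retransmission
        return False                  # suppress the duplicate ACK
```

The key point is that losses on the wireless hop are repaired locally and hidden from the fixed sender, so TCP’s congestion control is not triggered by non-congestion losses.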

The authors perform simple initial experiments on the mechanisms in each category. Of the link layer schemes, the authors conclude that the ones that use knowledge of TCP semantics perform the best. Of the end-to-end protocols, the results suggest that TCP New Reno is better than TCP Reno and that a scheme that makes use of both SACKs and ELN would probably perform the best. Of the split-connection schemes, SPLIT-SMART is better, but the performance of these schemes is overall not as good as that of the schemes in the other categories. Using the most promising schemes from their initial experiments, the authors then test those schemes’ reaction to burst errors and their performance at different error rates. Overall, the authors drew four main conclusions: (1) of all the schemes investigated, a link layer protocol with TCP awareness and selective acknowledgments performs the best, (2) splitting the end-to-end connection is not necessary for good performance and violates end-to-end semantics, (3) selective acknowledgments are especially useful over lossy links when losses occur in bursts, and (4) end-to-end schemes are not as good as link layer schemes, but they are promising because they do not require any support from the link layer or intermediate nodes.

Critique

The comparisons and overviews of the different techniques for improving TCP over wireless were informative. One thing that is a bit questionable is that the authors themselves implemented each of the protocols. The reason I mention this is that in class I remember Prof. Katz telling us about a company that claimed to be able to distinguish between different implementations of TCP because each had unique signatures of sorts in the way they operated. This suggests that two different implementations of the same protocol aren’t necessarily the same. It might have been a fairer comparison to use the “standard implementation” of each protocol, but if that was not possible in all cases, then implementing them all is probably better anyway. On second thought, this method still works for the sake of comparing techniques, if not implementations, and you probably wouldn’t want the results to be implementation-dependent anyway. This paper was the first time I had heard about split-connection protocols, and I have to say, they don’t seem like a very good idea at all, but that could be because of the way they were presented and tested. It confused me that the split-connection schemes still used TCP over the wireless link; that didn’t seem very useful. One other thing that confused me is why, at one point in a footnote, the paper states that they only measure “receiver throughput,” when at an earlier point in the paper they state they measure “throughput at the receiver.” I think that in this case they intended the same meaning by these two phrases, but generally I would think the two phrases don’t mean the same thing. I also found the charts hard to read. Lastly, I think an interesting thing to study next would be a comparison of these protocols over more complex wireless topologies, as they mention in the section on future work.

MACAW: A Media Access Protocol for Wireless LANs


V. Bharghavan, A. Demers, S. Shenker, L. Zhang, "MACAW: A Media Access Protocol for Wireless LANs," ACM SIGCOMM Conference, (August 1994).

One line summary: This paper introduces a wireless media access protocol called MACAW that builds off of prior protocols; MACAW is based on the use of RTS, CTS, DS, ACK, and RRTS messages and backoff algorithms for when contention is detected.

Summary

This paper discusses modifications to a previous wireless access protocol called MACA to alleviate problems with performance and fairness. The result of these modifications is MACAW. MACAW is designed based on four key principles: (1) relevant media contention is at the receiver, not the sender, (2) congestion is location dependent and is not a homogeneous phenomenon, (3) learning about congestion must be a collective enterprise among devices, and (4) synchronization information should be propagated so all devices can contend fairly and effectively. MACAW was designed and tested with consideration for collision but not capture or interference; it was designed to tolerate noise but mostly tested in a noise-free setting.

This paper first describes problems with the CSMA and MACA protocols. It then describes the ways in which MACA was improved to yield MACAW. The basic mechanisms behind MACA are Request-To-Send (RTS) and Clear-To-Send (CTS) messages, along with a backoff algorithm. The first change MACAW makes is to the backoff algorithm, replacing binary exponential backoff (BEB) with multiplicative-increase linear-decrease (MILD) to reduce oscillation in backoff times, and adding backoff copying to prevent starvation. MACAW also keeps a separate queue for each stream, as opposed to one queue for the sender as a whole, to achieve a sort of intuitive fairness (not precisely defined here) across flows. It then introduces four changes to the RTS-CTS message exchange: (1) acknowledgment (ACK) messages to avoid transport layer retransmission backoffs, (2) Data-Sending (DS) messages to indicate the beginning of data transmission, (3) Request-for-Request-to-Send (RRTS) messages to allow a receiver to contend on behalf of a sender when the receiver was previously unable to respond to an RTS due to contention, and (4) a procedure for multicast. MACAW then further refines the backoff algorithm by having each station maintain a backoff counter for each stream that reflects congestion at both the sender and the receiver.
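
As a rough sketch of the MILD backoff and backoff copying ideas described above (not the authors’ code; the constants and method names are illustrative rather than necessarily the values MACAW uses):

```python
import random


class MILDBackoff:
    """Multiplicative-increase, linear-decrease backoff (sketch).
    Backoff is expressed in slot times; constants are illustrative."""
    MIN_BO, MAX_BO = 2, 64
    INCREASE_FACTOR = 1.5
    DECREASE_STEP = 1

    def __init__(self):
        self.backoff = self.MIN_BO

    def on_collision(self):
        """Grow the backoff multiplicatively, up to a ceiling."""
        self.backoff = min(int(self.backoff * self.INCREASE_FACTOR), self.MAX_BO)

    def on_success(self):
        """Shrink the backoff linearly, down to a floor."""
        self.backoff = max(self.backoff - self.DECREASE_STEP, self.MIN_BO)

    def copy_from_packet(self, heard_backoff):
        """Backoff copying: adopt the backoff value carried in an overheard
        packet header so all stations share congestion information."""
        self.backoff = heard_backoff

    def wait_slots(self):
        """Number of slots to wait before the next transmission attempt."""
        return random.randint(1, self.backoff)
```

Compared to BEB, the gentler decrease keeps the backoff from collapsing right after one success, and copying the backoff value heard in other stations’ packets spreads congestion information so stations contend on a roughly equal footing.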

All evaluation of MACAW was done via simulation. The authors mention several design options they chose not to implement, including piggy-backed ACKs, NACKs, and the use of carrier sense or intermediate options involving clean signals, and they also describe alternative mechanisms that are entirely different from MACAW, such as token-based schemes and polling or reservation schemes. They admit that problems arise in certain situations that they have not solved and that their solution in the multicast case is not good. They also admit that they don’t have a precise definition of fairness.

Critique

I liked this paper’s candidness about the shortcomings of its approach. I did not like that they performed all evaluations using simulations instead of attempting a real-world evaluation. It would have been nice if they could have implemented some of the design alternatives mentioned, such as NACKs; since they were only using simulations, I would think this would have been relatively easy to do. I also wish they had said more about their ideas for dealing with the multicast problem, since I’m sure they had some. One obvious idea is to have the sender send a special Multicast-Request-To-Send (MRTS) message to each of the multicast receivers in turn, to avoid contention among the CTS messages from the receivers. The MRTS message could indicate to the receivers that they will have to wait some time before receiving the DS message and the data. I’m sure there are problems with this solution and it wouldn’t work for all multicast situations, but it would have been nice if they had discussed some of the potential solutions and their shortcomings in a little more depth. Overall I liked this paper, though, because it seems like it laid the groundwork for a lot of later wireless protocols and it was easy to read. Their explanations and examples were clean, and they explained the evolution of and reasoning behind their design well.