Tuesday, October 27, 2009

On Inferring Autonomous System Relationships in the Internet


L. Gao, "On Inferring Autonomous System Relationships in the Internet," IEEE/ACM Transactions on Networks, V. 9, N. 6, (December 2001), pp. 733-745.


One line summary: This paper provides a heuristic for classifying relationships between autonomous systems in the internet into provider-customer, peering, and sibling relationships and evaluates the inferences of the heuristics using AT&T internal data as well as the WHOIS lookup service.

Summary

This paper proposes a method for classifying relationships between Autonomous Systems (ASs) into provider-customer, peering, and sibling relationships. These relationships are based on routing and commercial contractual relationships between administrative entities. The classification is tested using BGP routing table data collected via RouteViews. It is verified using AT&T internal information about its ASs and their relationship with neighboring ASs, as well as by using the WHOIS lookup service. The paper first reviews BGP routing and policies. BGP speaking routers in an AS enforce import policies to determine which paths to include and use in their routing tables, and export policies to determine which paths to advertise to neighbors. The relationships between an AS and its neighbors determine what export policies the AS enforces. In general, ASs follow the selective export rule, which basically states that an AS will not export its provider or peer routes to its providers or peers and it will export customer, provider, and peer routes with its siblings and customers. This rule leads to the property that the AS path of a BGP routing table entry will be valley free. For a valley free AS path there is an uphill portion of the path that consists only of customer-to-provider or sibling-to-sibling edges and a downhill path that consists only of provider-to-customer or sibling-to-sibling links. The uphill and downhill paths may be connected by a peer-to-peer link. From these definitions it is possible to define maximal uphill and maximal downhill paths. In addition, the uphill top provider is defined as the AS with the highest degree, or largest size, among all the ASs in the maximal uphill path, and the downhill top provider is defined analogously. The overall top provider is the AS with the highest degree among the uphill and downhill top providers.


Given these considerations, the basic heuristic of used to classify edges between ASs is as follows: it looks through the AS path of each BGP routing table entry. It defines the highest degree AS as the top provider. The consecutive AS pairs before the top provider are then either customer-to-provider links or sibling-to-sibling links, and those after the top provider are either provider-to-customer links or sibling-to-sibling links. A refined version of the heuristic attempts to reduce the possibility of incorrect inference from routing misconfigurations by only concluding a transit relationship exists if a certain number of routing table entries imply that. The final version of the algorithm attempts to identify peering relationships by identifying potential peers based on similarity of size (i.e. it assumes the degree of two ASs that have a peering relationship do not differ by more than a certain factor, R, that must be fine-tuned) and concludes that two ASs have a peering relationship if neither of the two provides transit for the other.


The author then compares the inferences of the algorithms with AT&T internal information. The algorithm tends to correctly infer customer relationships correctly most of the time and peering relationships a large part of the time, but tends to incorrectly infer sibling relationships. The author gives several reasons for incorrectly inferring sibling relationships, including router misconfigurations, unusual AS relationships, and heuristic inaccuracy.


Critique

I didn’t like this paper as much as the last one. I thought the author did a nice job of laying out the reasoning behind inferring the AS relationships but it could have been explained in fewer words; I also thought too much space was spent on reviewing BGP routing and policies. The heuristics didn’t do very well and they seemed pretty basic, so work that improves upon these methods would be interesting to see. I question the heuristics’ reliance on the degree or size of an AS as a way of classifying relationships, for instance, in determining peering relationships. And related to this, as the author says, the necessity of a parameter like R for determining how different in size two ASs have to be to be disqualified from consideration as peers is unfortunate. I am curious as to why the heuristic did so poorly in classifying sibling relationships but not the others. Although reasons were given they weren’t entirely satisfying and they could certainly be explored in more depth.


No comments:

Post a Comment