Cybertelecom
Federal Internet Law & Policy
An Educational Project

Border Gateway Protocol

When networks interconnect, they agree to announce routes to each other using the Border Gateway Protocol (BGP). This is known as "interdomain interconnection." (This is not to be confused with intradomain routing. Routing within a domain or AS network can have its own set of curiosities; it is generally conducted by a protocol other than BGP, such as OSPF (Open Shortest Path First) or RIP (Routing Information Protocol), and can involve MPLS creating virtual circuits across domains.)

There are two parts to BGP: (a) route announcements by the traffic receiving network and (b) route selection by the traffic sending network.

Announcements

A receiving network announces which destinations (which ASNs) it provides a route to, and how many hops (a.k.a. "AS path length") it takes to get there. [GAO, 2006, p. 7] If it does not announce a route, then there is no path through that network to that particular destination. The route announcement does not relay information about capacity or quality of service. The route announcement may include localization information (i.e., MEDs, indicating, for example, that the network would prefer to receive traffic destined for New York City at the interconnection point closest to New York City).

Routes that are within the receiving network's domain are OnNet and generally fall under peering. A receiving network can also announce routes to destinations that can be reached through the provider interconnecting with third party networks; these are OffNet and fall under transit.

Route Selection

The sending network listens to announcements and compiles a routing table. The routing table will contain a list of known routes, the blocks of IP addresses associated with each route, and the cost metrics associated with each route. Some of this information comes from BGP announcements; some the sending network adds to the table itself.

Based on the information in the routing table, the sending network will decide which route to use when sending traffic. The sending network looks in its routing table to see which networks provide a route to, for example, the destination address 192.104.54.5 and how many networks the packets have to go through. Based on that information, the router will select a route and send the packets off to the next hop, which will then do the same lookup and make a similar decision, until the packets reach their destination.
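The lookup step can be sketched in a few lines of Python. This is only an illustration: the prefixes, the neighbor names (BAR, BAZ, UPSTREAM), and the path lengths are invented, and the table is far simpler than a real router's. It applies the rule that the most specific (longest) matching prefix wins.

```python
import ipaddress

# Hypothetical routing table: prefix -> (next-hop network, AS path length).
# All entries are invented for illustration.
ROUTING_TABLE = {
    "192.104.0.0/16": ("BAR", 2),
    "192.104.54.0/24": ("BAZ", 3),
    "0.0.0.0/0": ("UPSTREAM", 5),
}

def lookup(destination, table=ROUTING_TABLE):
    """Return (prefix, next_hop, path_len) for the most specific matching prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop, path_len)
        for prefix, (next_hop, path_len) in table.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # The longest prefix (largest prefixlen) is the most specific match.
    net, next_hop, path_len = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), next_hop, path_len

print(lookup("192.104.54.5"))  # ('192.104.54.0/24', 'BAZ', 3)
```

Note that the /24 entry is chosen even though its AS path is longer than the /16 entry's; prefix specificity is decided before path length is considered.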

* Note that a routing loop can occur when FOO and BAR keep sending the traffic back and forth because their routing tables tell each of them that the other is the "best route" to the destination ASN 9.
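The back-and-forth described in the note can be sketched as follows. The next-hop tables are hypothetical, and the hop limit stands in for the time-to-live counter that eventually discards a looping packet.

```python
# Hypothetical next-hop tables: FOO and BAR each believe the other is the
# best route to ASN 9.
NEXT_HOP = {
    "FOO": {"ASN 9": "BAR"},
    "BAR": {"ASN 9": "FOO"},
}

def forward(start, destination, ttl=8):
    """Follow next hops until the destination is reached or the TTL runs out."""
    trace = [start]
    current = start
    while ttl > 0:
        nxt = NEXT_HOP.get(current, {}).get(destination)
        if nxt is None:
            return trace, "no route"
        trace.append(nxt)
        if nxt == destination:
            return trace, "delivered"
        current = nxt
        ttl -= 1
    return trace, "ttl expired"

trace, outcome = forward("FOO", "ASN 9", ttl=4)
print(trace, outcome)  # ['FOO', 'BAR', 'FOO', 'BAR', 'FOO'] ttl expired
```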

Alternative Routes

If there is a choice of routes (if different networks are announcing routes to a destination), how does a sending network decide which route to utilize? A sending network will select which route to send traffic to based on the following criteria in the following order:

[CISCO, BGP Best Path Selection]

Filtering

The sending network will engage in a certain degree of filtering of possible routes, removing, for instance, prefixes that a customer does not actually own, configuration mistakes, and routes involved in attacks. Almost every peering policy calls on a peering partner to filter routes. An announcing network will also filter out ASNs that it does not want to announce. An AS receiving route announcements should filter out AS paths that include its own AS number, in order to avoid a route loop. [NIST 800-189 Sec. 2.2]
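Two of the filters described above, the own-ASN loop check and the customer prefix check, can be sketched as below. The ASNs and prefixes come from the ranges reserved for documentation, and the `CUSTOMER_PREFIXES` table and `accept` function are hypothetical names, not any vendor's configuration.

```python
import ipaddress

LOCAL_ASN = 64496  # hypothetical ASN of the network doing the filtering

# Hypothetical record of the prefixes each customer AS may announce.
CUSTOMER_PREFIXES = {
    64500: [ipaddress.ip_network("192.0.2.0/24")],
}

def accept(announcement, neighbor_asn):
    """Return (accepted, reason) for one received route announcement."""
    prefix = ipaddress.ip_network(announcement["prefix"])
    as_path = announcement["as_path"]
    # Loop prevention: reject any path that already contains our own ASN.
    if LOCAL_ASN in as_path:
        return False, "own ASN in AS path"
    # Customer filter: accept only prefixes recorded for this customer.
    allowed = CUSTOMER_PREFIXES.get(neighbor_asn)
    if allowed is not None and not any(prefix.subnet_of(net) for net in allowed):
        return False, "prefix not registered to customer"
    return True, "accepted"

print(accept({"prefix": "198.51.100.0/24", "as_path": [64500]}, 64500))
```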

Local Preference

Where there are alternative paths, there might be good business reasons for selecting one route over another. The sending network might select a customer's route over a free route (after all, the customer is paying). The sending network might select a settlement-free route over a route where it is the transit customer. The sending network can assign "local preferences" to different routes so that route selection is made based on these criteria. For instance, the sending network might assign values as follows:

The route with the highest score takes the prize. [CISCO, BGP Best Path Selection] [CSRIC Sec. 5.2]
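A minimal sketch of selection by local preference follows. The numeric values and network names are invented for illustration; the only property that matters is the ordering, with customer routes above settlement-free peer routes and peer routes above paid transit.

```python
# Hypothetical local-preference values; the numbers are illustrative only.
LOCAL_PREF = {"customer": 300, "peer": 200, "transit": 100}

def best_by_local_pref(routes):
    """Pick the route whose relationship type carries the highest local preference."""
    return max(routes, key=lambda r: LOCAL_PREF[r["relationship"]])

routes = [
    {"via": "TransitCo", "relationship": "transit"},
    {"via": "PeerNet", "relationship": "peer"},
    {"via": "CustomerCorp", "relationship": "customer"},
]
chosen = best_by_local_pref(routes)
print(chosen["via"])  # CustomerCorp
```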

AS Path Length

When BAR announces that it has a route to ASN 9 through ASN 8, it is announcing a route and a path length. In this case the path length is 2 (two AS hops). If FOO were directly interconnected with ASN 9, ASN 9 would also be announcing a route to ASN 9 with a path length of 1. Under normal circumstances, FOO will listen to both BGP announcements, compare the path lengths, and send the traffic along the route with the shortest path length. In this case, FOO would choose to send the traffic directly to ASN 9 instead of sending it through BAR.

An announcing network can manipulate AS Path Length by making it appear that a route is longer than it is. An announcing network can prepend ASNs to its announcements to extend the AS Path Length. In the example above, BAR made the announcement "ASN 8 ASN 9" - that it is a two hop route to ASN 9. If it instead makes the announcement "ASN 8 ASN 8 ASN 9," it makes it seem like ASN 9 is three hops away, and thereby influences the routing decisions of the sending network. [BGP Best Path Selection and Manipulation, CISCO (2014)]
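The path-length comparison and the effect of prepending can be sketched as follows, using the FOO / BAR / ASN 9 example from the text. The function names and the third network, QUX, are hypothetical.

```python
def shortest_path(announcements):
    """Return the announcement with the fewest entries in its AS path."""
    return min(announcements, key=lambda a: len(a["as_path"]))

def prepend(as_path, asn, times=1):
    """Return a new AS path with `asn` added `times` extra times at the front."""
    return [asn] * times + list(as_path)

via_bar = {"from": "BAR", "as_path": ["ASN 8", "ASN 9"]}
direct = {"from": "ASN 9", "as_path": ["ASN 9"]}

# FOO prefers the direct one-hop route over the two-hop route through BAR.
print(shortest_path([via_bar, direct])["from"])  # ASN 9

# With prepending, BAR's route looks longer than an otherwise equal
# two-hop route from a hypothetical network QUX.
padded = {"from": "BAR", "as_path": prepend(via_bar["as_path"], "ASN 8")}
via_qux = {"from": "QUX", "as_path": ["ASN 7", "ASN 9"]}
print(shortest_path([padded, via_qux])["from"])  # QUX
```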

NOTE: With the evolution of the Internet ecosystem and CDNs directly connecting to large BIAS providers at IXPs, one would anticipate that AS Path lengths would be shortening. An AS Path would include the large BIAS provider and the CDN if directly connected, or it could be the BIAS provider, an intermediary transit provider, and a CDN if indirectly connected.

Multi-Exit Discriminator (MED)

BAR can also announce MEDs. In effect, BAR is stating a localization preference: it wants traffic bound for a given destination to be handed off at the interconnection point nearest that destination (a.k.a. cold potato routing).

Simply because a receiving network announces MEDs does not mean that the sending network has to honor them. Generally the sending network will honor MEDs when the two networks have an interconnection contract with terms that specify provisions concerning MEDs.
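The choice of whether to honor MEDs can be sketched as below. In BGP a lower MED value is preferred. The exit names, MED values, and the `sender_cost` field are invented for illustration, and `honor_med` stands in for whatever the interconnection contract provides.

```python
def pick_exit(routes, honor_med):
    """Choose among routes to one destination learned from the same neighbor."""
    if honor_med:
        # A lower MED value marks the exit the announcing network prefers.
        return min(routes, key=lambda r: r["med"])
    # Otherwise the sender uses whichever exit is cheapest for itself.
    return min(routes, key=lambda r: r["sender_cost"])

# Hypothetical routes from BAR for a destination in New York City.
routes = [
    {"exit": "Los Angeles", "med": 50, "sender_cost": 1},
    {"exit": "New York", "med": 10, "sender_cost": 20},
]

print(pick_exit(routes, honor_med=True)["exit"])   # New York
print(pick_exit(routes, honor_med=False)["exit"])  # Los Angeles
```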

References

History

"NSFNET introduced a complexity into the Internet, which the existing network protocols could not handle. Up to the NSFNET, the Internet consisted basically of the ARPAnet, with client networks stubbed off the ARPAnet backbone. I.e., the hierarchy between so-called Autonomous Systems (AS) was linear, with no loops/meshes, with the Exterior Gateway Protocol (EGP) used for for inter-AS routing carrying the AS Number of the routing neighbor. This made it impossible to detect loops in an environment where two or more separate national backbones with multiple interconnections exist, specifically the ARPAnet and the NSFNET. I defined that I needed an additional "previous" AS Number for the inter-AS routing to allow supporting a meshed Internet with many administrations for its components. Meetings with various constituents did not get us anywhere, and I needed it quickly, rather then creating a multi-year research project. In the end, Yakov Rekhter (IBM/NSFNET) and Kirk Lougheed (Cisco) designed a superset of what I needed on three napkins alongside an IETF meeting that included not just the "previous" AS Number but all previous AS numbers that an IP network number route had encountered since its origin. This protocol was called the Border Gateway Protocol (BGP) and versions of it are in use to this day to hold the Internet together. BGP used the Transmission Control Protocol (TCP) to make itself reliable. Use of TCP as well as general "not invented here" caused great problems with the rest of the Internet community, which we somewhat ignored as we had a pressing need, and soon with NSFNET, Cisco and gated implementations at hand, the Internet community did not have much of a choice. Eventually and after long arguments, BGP got adopted by the IETF." [Braun]

Definitions

Autonomous System

"An AS is a connected group of one or more IP prefixes run by one or more network operators which has a SINGLE and CLEARLY DEFINED routing policy." IETF RFC 1930.

"An Autonomous System (AS) is a group of one or more IP prefixes run by one or more network operators that maintains a single, clearly defined routing policy. An IP prefix is a list of IP addresses that can be reached from that ISP’s network. The network operators must have an ASN to control routing within their networks and to exchange routing information with other ISPs." ARIN

Autonomous System: "A group of routers under a single administration." Service Provider Interconnection for Internet Protocol Best Effort Service, Network Reliability and Interoperability Council V, Focus Group 4: Interoperability, Sec. 1.2.2

Autonomous System Number

"An ASN is a globally unique number used to identify an Autonomous System. An ASN enables an AS to exchange exterior routing information with neighboring ASes." ARIN

"An AS has a globally unique number (sometimes referred to as an ASN, or Autonomous System Number) associated with it; this number is used in both the exchange of exterior routing information (between neighboring ASes), and as an identifier of the AS itself." IETF RFC 1930.

Prefix

In the current classless Internet (see [CIDR]), a block of class A, B, or C networks may be referred to by merely a prefix and a mask, so long as such a block of networks begins and ends on a power-of-two boundary. For example, the networks:

192.168.0.0/24
192.168.1.0/24
192.168.2.0/24
192.168.3.0/24

can be simply referred to as:

192.168.0.0/22

The term "prefix" as it is used here is equivalent to "CIDR block", and in simple terms may be thought of as a group of one or more networks. We use the term "network" to mean classful network, or "A, B, C network". [J. Hawkinson, T. Bates, Guidelines for Creation, Selection, and Registration of an Autonomous System, IETF RFC 1930, Sec. 3 Definitions (March 1996)]
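The aggregation in the RFC's example can be checked with Python's standard `ipaddress` module. The second list is an added illustration of the power-of-two boundary requirement: four consecutive /24s that do not start on a /22 boundary cannot be expressed as a single prefix.

```python
import ipaddress

aligned = [ipaddress.ip_network(n) for n in (
    "192.168.0.0/24", "192.168.1.0/24", "192.168.2.0/24", "192.168.3.0/24")]
aggregated = list(ipaddress.collapse_addresses(aligned))
print(aggregated)  # [IPv4Network('192.168.0.0/22')]

# Same number of networks, shifted by one so the block is no longer aligned.
unaligned = [ipaddress.ip_network(n) for n in (
    "192.168.1.0/24", "192.168.2.0/24", "192.168.3.0/24", "192.168.4.0/24")]
partial = list(ipaddress.collapse_addresses(unaligned))
print(len(partial))  # 3
```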

Stub Network

"those networks that only provide connectivity to their end systems." [Kotikalapudi Sriram, Doug Montgomery, Resilient Interdomain Traffic Exchange: BGP Security and DDOS Mitigation, NIST SP 800-189 (Dec. 2019)] A stub network has no customer AS networks.

Hot / Cold Potato Routing

"Hot Potato Routing" is an interconnection policy between peers where one network hands off traffic to another network at the closest exchange point. If both networks follow Hot Potato Routing and if traffic levels are relatively balanced, then each network will bear a relatively equal share of the cost of carrying the traffic. [NRIC Sec. 1.2.2 ("A form of inter-domain routing in which a packet destined for a neighboring ISP is sent via the nearest interconnect to that ISP.")] [AT&T Ex Parte with Commissioner Pai, The Internet Interconnection Ecosystem, Slide 13, June 26, 2014 ("A form of inter-domain routing in which a packet destined for a neighboring ISP is sent via the nearest interconnect to that ISP.")]

The history of "Hot Potato Routing" has its roots in the work of Paul Baran. "Hot Potato Routing" for Baran was not so much a part of an interconnection / settlement scheme as a protocol to ensure reliability and resiliency. [Roberts, Computer Science Museum p. 14 1988]

Content Delivery Networks generally engage in "Cold Potato Routing" (a.k.a. "Best Exit Routing"), holding onto traffic for as long as possible and handing it off as close to the eyeballs as possible, seeking to manage quality of service and defray the transit costs of the receiving networks. [AT&T Ex Parte with Commissioner Pai, The Internet Interconnection Ecosystem, Slide 14, June 26, 2014 (diagramming cold potato routing).]
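The cost-sharing point can be illustrated with a toy model. Assume, purely for illustration, two networks that both span a line 3,900 km long and interconnect at three cities; the positions and the `handoff` function are invented. Under hot potato routing the sender carries each flow only a short distance, but because the same is true in the reverse direction, the total distance each network carries comes out roughly equal.

```python
# Hypothetical exchange points, as distance in km from the west end of the line.
EXCHANGE_KM = {"Los Angeles": 10, "Chicago": 2800, "New York": 3885}

def handoff(source_km, dest_km, policy):
    """Return (exchange, sender_km, receiver_km) for one flow."""
    if policy == "hot":
        # Hand off at the exchange nearest the source.
        key = lambda city: abs(EXCHANGE_KM[city] - source_km)
    else:
        # "cold": hand off at the exchange nearest the destination.
        key = lambda city: abs(EXCHANGE_KM[city] - dest_km)
    city = min(EXCHANGE_KM, key=key)
    return city, abs(EXCHANGE_KM[city] - source_km), abs(dest_km - EXCHANGE_KM[city])

west_to_east = handoff(0, 3900, "hot")   # sender is network A
east_to_west = handoff(3900, 0, "hot")   # sender is network B
a_total = west_to_east[1] + east_to_west[2]
b_total = west_to_east[2] + east_to_west[1]
print(a_total, b_total)  # 3895 3905

print(handoff(0, 3900, "cold"))  # ('New York', 3885, 15)
```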

Internal Routing

In order to route traffic internally, networks use an intradomain routing protocol, such as OSPF or RIP, rather than BGP.

BGP Security

Derived From NIST SP 800-189 at 3

"A BGP prefix hijack occurs when an autonomous system (AS) accidentally or maliciously originates a prefix that it is not authorized (by the prefix owner) to originate. This is also known as false origination (or announcement). In contrast, if an AS is authorized to originate/announce a prefix by the prefix owner, then such a route origination/announcement is called legitimate. In the example illustrated in Figure 1, prefix 192.0.2.0/24 is legitimately originated by AS64500, but AS64510 falsely originates it. The path to the prefix via the false origin AS will be shorter for a subset of the ASes on the internet, and this subset of ASes will install the false route in their routing table or forwarding information base (FIB). That is, ASes for which AS64510 is closer (i.e., shorter AS path length) would choose the false announcement, and thus data traffic from clients in those ASes destined for the network 192.0.2/24 will be misrouted to AS64510."

"The rules for IP route selection on the internet always prefer the most specific (i.e., longest) matching entry in a router’s FIB. When an offending AS falsely announces a more-specific prefix (than a prefix announced by an authorized AS), the longer, unauthorized prefix will be widely accepted and used to route data. Figure 1 also illustrates an example of unauthorized origination of unallocated (reserved) address space 240.18.0.0/20. Currently, 240.0.0.0/8 is reserved for future use [IANA-v4-r]. Similarly, an AS may also falsely originate allocated but currently unused address space. This is referred to as prefix squatting, where someone else’s unused prefix is temporarily announced and used to send spam or for some other malicious purpose."

"The various types of unauthorized prefix originations described above are called prefix hijacks or false origin announcements. The unauthorized announcement of a prefix longer than the legitimate announcement is called a sub-prefix hijack. The consequences of such adverse actions can be serious and include denial-of-service, eavesdropping, misdirection to imposter servers (to steal login credentials or inject malware), or defeat of IP reputation systems to launch spam email. There have been numerous incidents involving prefix hijacks in recent years. There are several commercial services and research projects that track and log anomalies in the global BGP routing system [BGPmon] [ThousandEyes] [BGPStream] [ARTEMIS]. Many of these sites provide detailed forensic analyses of observed attack scenarios."
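The distinction between a prefix hijack and a sub-prefix hijack can be sketched with the example values from the quoted passage. The `AUTHORIZED` table and `classify` function are hypothetical and deliberately simplified; this is not how any particular validation system works, only a way to see the two cases side by side.

```python
import ipaddress

# Hypothetical record of which AS may originate which prefix, using the
# example values from the quoted passage.
AUTHORIZED = {ipaddress.ip_network("192.0.2.0/24"): 64500}

def classify(prefix, origin_asn):
    """Label an origination relative to the AUTHORIZED table."""
    net = ipaddress.ip_network(prefix)
    for authorized_net, authorized_asn in AUTHORIZED.items():
        if net == authorized_net:
            return "legitimate" if origin_asn == authorized_asn else "prefix hijack"
        if net.subnet_of(authorized_net):
            # A longer (more specific) prefix inside someone else's block.
            return "legitimate" if origin_asn == authorized_asn else "sub-prefix hijack"
    return "unknown"

print(classify("192.0.2.0/24", 64500))  # legitimate
print(classify("192.0.2.0/24", 64510))  # prefix hijack
print(classify("192.0.2.0/25", 64510))  # sub-prefix hijack
```

An "unknown" result means only that this toy table has no entry covering the prefix, as would be the case for the reserved 240.18.0.0/20 block in the passage.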

Routing anomalies (aka, "hijacks," "false origin announcements") involve incorrect BGP route announcements

Misconfigured announcements can be of

The announced routing address space may be

Routing anomalies may be:

Routing anomalies may result in [NIST 800-189 Sec. 2.1]

Routing Security:

Prevention of routing anomalies

Routing anomalies:

Barriers to BGP Security

[Fred Baker, Internet Routing with MANRS, MANRS (n.d.) (“Customers trust that their ISPs and IXPs will connect them to those entities with whom they want to communicate. Routing incidents, such as accepting or propagating a false prefix, are a fundamental service failure in that they connect their customers to someone else.”).]

Government Activity

Statistics | Assessment | Forensics

Network interconnection arrangements are announced through BGP. Organizations that listen to these announcements can develop a relatively accurate picture of who interconnects with whom and whether the arrangement is transit or peering. Because these are routing announcements, the organizations can detect what routes are announced, but not the financial terms of the arrangements. See also Internet eXchange Points, Backbones.
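A first step in building such a picture is to extract which ASes appear next to each other in observed AS paths. The sketch below uses invented paths with documentation ASNs. It recovers adjacency only; it makes no attempt to infer whether a link is transit or peering.

```python
from collections import defaultdict

def adjacencies(as_paths):
    """Map each ASN to the set of ASNs it appears next to in any observed path."""
    neighbors = defaultdict(set)
    for path in as_paths:
        # Collapse prepending (consecutive repeats) before pairing neighbors.
        deduped = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        for left, right in zip(deduped, deduped[1:]):
            neighbors[left].add(right)
            neighbors[right].add(left)
    return dict(neighbors)

# Hypothetical AS paths as seen by a listening organization.
observed = [
    [64496, 64500, 64510],
    [64497, 64500, 64500, 64510],
    [64496, 64511],
]
graph = adjacencies(observed)
print(sorted(graph[64500]))  # [64496, 64497, 64510]
```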

Organizations

Papers

News