What is BGP
Border Gateway Protocol (BGP) is regarded as one of the most influential network protocols, as it is the backbone of the internet today. BGP is a Path Vector Routing Protocol that, unlike other routing protocols, uses TCP (port 179) as its transport layer to establish connectivity before exchanging routing information with another BGP speaker (peer). BGP communication can take place within the same network or between different networks. These networks are known as Autonomous Systems (AS), with an AS being a set of routers managed by a single entity, business or company. BGP uses routing information to maintain a BGP Routing Information Base (RIB) of Network Layer Reachability Information (NLRI), which it exchanges with other BGP peers or peer ASs. BGP is a classless protocol: it can support any IP prefix regardless of class, for both IPv4 and IPv6. It is important to note that BGP requires a TCP connection before a BGP session can be built. Without that TCP session a BGP peering will never form; however, once the session is connected it does not have to be made again unless a change is made. BGP uses Keepalive messages to ensure the reliability of the session, as it does not use any transport protocol-based keep-alive mechanism to determine if peers are reachable.
BGP is largely (but not exclusively) used in large enterprises and data centre hosting environments that need single-homed or multihomed connections to one or more Internet Service Providers (ISPs); peering between different ASs like this is known as Exterior BGP (eBGP). BGP is also used extensively within Service Provider environments. BGP offers a large range of policy-based controls that let an AS influence and/or manipulate inbound and outbound routed traffic to optimise the movement of traffic for its own needs. Additionally, BGP can be used between BGP routers within the same AS to advertise internal routes with the same level of control as eBGP, with some small but important differences; this is known as Interior BGP (iBGP).
eBGP vs iBGP
There are some key differences between eBGP and iBGP that are important to note:
- An eBGP session is between BGP peers with different AS numbers
- Inter-AS communication is via eBGP
- eBGP respects the AS_Path Path Attribute
- Routes learnt via eBGP will be advertised to other eBGP and iBGP peers
- An iBGP session is between BGP peers with the same AS number
- Intra-AS communication can be via iBGP
- iBGP commonly uses an IGP for network reachability and to establish the BGP TCP session via Loopback addresses
- Routes learnt via iBGP will not be advertised to other iBGP peers, however they will be advertised to eBGP peers
The above isn’t a full list of the differences, just some of the main ones that you need to remember. Additionally, there are situations where some of these rules may need to be manipulated, which can be done in design and/or configuration; however, that is for later.
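The default rules in the list above can be written down as two small functions. This is only a minimal sketch: the function names and the "eBGP"/"iBGP" strings are made up for illustration, and it deliberately ignores the design options (such as Route Reflectors) that relax the iBGP rule.

```python
def session_type(local_as: int, peer_as: int) -> str:
    """Return "iBGP" when both speakers share an AS number, else "eBGP"."""
    return "iBGP" if local_as == peer_as else "eBGP"


def may_advertise(learnt_via: str, advertise_to: str) -> bool:
    """Apply the default advertisement rules from the list above.

    Routes learnt via eBGP go to eBGP and iBGP peers.
    Routes learnt via iBGP go to eBGP peers only.
    """
    if learnt_via == "iBGP" and advertise_to == "iBGP":
        return False
    return True


if __name__ == "__main__":
    print(session_type(65001, 65002))     # different AS numbers
    print(may_advertise("iBGP", "iBGP"))  # blocked by default
```

The iBGP-to-iBGP case is the only combination that is blocked by default, which is why iBGP designs normally need a full mesh or a Route Reflector.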
BGP Peering States
When establishing a BGP session there are 6 states that need to be completed before the peering session comes up. The first 3 states ensure that TCP transport layer connectivity is in place; once this has been completed, BGP connectivity is established with the final 3 states:
|Idle||TCP||This is when all BGP connections will be refused. An Idle state occurs when the BGP session hasn’t been configured on the other BGP peer or BGP isn’t enabled at all. Commonly, a start event is required before the router begins to prepare TCP connectivity.|
|Connect||TCP||The router is listening for TCP connections and is waiting for the TCP 3-way handshake to be completed.|
|Active||TCP||This is when the BGP peer is trying to establish a TCP connection.|
|OpenSent||BGP||In the OpenSent state, the local router has sent an Open message and is waiting to receive one from the remote peer.|
|OpenConfirm||BGP||In the OpenConfirm state, BGP is waiting on a Keepalive or Notification message.|
|Established||BGP||Having received the Keepalive message, the BGP session is fully Established. The peers are now able to exchange Update, Notification and Keepalive messages.|
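To make the order of the states easier to follow, here is a simplified sketch of the "happy path" through the six states. The event names are hypothetical, and the real state machine in RFC 4271 has many more events, timers and transitions than are shown here; anything unexpected simply drops back to Idle in this sketch.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# Hypothetical event names, covering only the successful path.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_connected"): State.OPEN_SENT,
    (State.ACTIVE, "tcp_connected"): State.OPEN_SENT,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
}


def next_state(state: State, event: str) -> State:
    """Return the next state; any unexpected event falls back to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events) -> State:
    """Feed a sequence of events to a session that starts in Idle."""
    state = State.IDLE
    for event in events:
        state = next_state(state, event)
    return state


if __name__ == "__main__":
    path = ["start", "tcp_connected", "open_received", "keepalive_received"]
    print(run(path).value)
```

A Notification received in OpenConfirm is not in the table, so in this sketch it sends the session back to Idle.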
BGP Message Types
As shown above, a number of different messages are sent between BGP peers to establish a session, and even once the peering has been established, messages are used to ensure that both peers have synchronized routing information. BGP can only process a message after the entire message has been received. The maximum message size is 4096 bytes and the smallest is 19 bytes, which is just a header with no data. Every message type uses a fixed header size of 19 bytes; BGP Keepalives do not include any data after the header, so they always use the minimum size.
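The 19-byte header can be unpacked with a few lines of Python. The layout used here (a 16-byte marker of all ones, a 2-byte length and a 1-byte type) comes from RFC 4271 rather than from the text above, and the function names are only illustrative.

```python
import struct

HEADER_LEN = 19          # 16-byte marker + 2-byte length + 1-byte type
MAX_MESSAGE_LEN = 4096   # largest message described above
MARKER = b"\xff" * 16    # marker field is all ones (RFC 4271)
KEEPALIVE_TYPE = 4       # type code for a Keepalive message


def parse_header(data: bytes):
    """Return (length, type_code) from the first 19 bytes of a message."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 19 bytes for a BGP header")
    marker, length, type_code = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != MARKER:
        raise ValueError("marker field is not all ones")
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        raise ValueError(f"invalid message length {length}")
    return length, type_code


def build_keepalive() -> bytes:
    """A Keepalive is just the header: length 19 and no data after it."""
    return struct.pack("!16sHB", MARKER, HEADER_LEN, KEEPALIVE_TYPE)


if __name__ == "__main__":
    print(parse_header(build_keepalive()))
```

Because the length field covers the whole message including the header, a receiver can read 19 bytes first and then knows exactly how many more bytes to wait for before processing.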
Each message includes the following:
|Open||Once the TCP connection has been completed, both peers will send out an Open message. This message starts the peering session and provides details about the remote peer, in addition to details about the options and optional parameters it supports.
These details are included:
|Update||An Update message sends a list of new, changed or withdrawn routes from the remote peer. Depending on routing policy, these may or may not be entered into the Routing Table.
These details are included:
|Keepalive||It is important to always remember that Keepalive messages are not used to keep the TCP connection between peers alive. They are used to ensure that BGP Hold Timers do not expire, keeping the route exchange alive.|
|Notification||A Notification message is used to inform a peer that there is an error with the BGP session.
There are 6 Error code numbers:
In addition there are 17 Sub Error codes (6 Open Message Errors and 11 Update Message Errors). These can be found in RFC 4271.|
|Refresh||Normally BGP cannot readvertise routes that have already been acknowledged by a peer. If the BGP peers have been configured to allow a soft clear of BGP sessions, they will be able to exchange Refresh messages. With some vendors you have to configure this explicitly: on Cisco you need to configure soft-reconfiguration, whereas on Juniper it is enabled by default within JunOS.|
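As a quick reference, the message types and the six Notification error codes can be listed as Python enums. The numeric values are taken from RFC 4271 (and RFC 2918 for Route Refresh), not from the table above, and `describe_notification` is a hypothetical helper that only decodes the first two bytes of a Notification body.

```python
from enum import IntEnum


class MessageType(IntEnum):
    OPEN = 1
    UPDATE = 2
    NOTIFICATION = 3
    KEEPALIVE = 4
    ROUTE_REFRESH = 5  # defined in RFC 2918


class ErrorCode(IntEnum):
    MESSAGE_HEADER_ERROR = 1
    OPEN_MESSAGE_ERROR = 2
    UPDATE_MESSAGE_ERROR = 3
    HOLD_TIMER_EXPIRED = 4
    FSM_ERROR = 5
    CEASE = 6


def describe_notification(body: bytes) -> str:
    """Decode the error code and subcode at the start of a Notification body."""
    if len(body) < 2:
        raise ValueError("a Notification body needs a code and a subcode")
    code, subcode = body[0], body[1]
    return f"{ErrorCode(code).name} (subcode {subcode})"


if __name__ == "__main__":
    print(describe_notification(bytes([4, 0])))
```

An unknown error code raises `ValueError` from the enum lookup, which is enough for a sketch but would need friendlier handling in a real tool.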
Unlike other routing protocols, BGP’s primary function is to find the best path to a destination, not the shortest path. BGP uses a number of attributes to calculate the best path for any given destination prefix. These attributes can be broken down into 4 types:
|Well Known Attribute Types|
|Well known Mandatory||These attributes must be known and understood by all BGP speakers. Additionally, they must exist within BGP update messages.
Attributes classed as Well Known Mandatory:
|Well known Discretionary||These attributes must be known and understood by all BGP speakers. However, they don’t have to exist within a BGP update message.
Attributes classed as Well Known Discretionary:
|Optional BGP Attribute Types|
|Optional Transitive||These attributes don’t need to be understood by a BGP speaker, however the set flag(s) will need to be passed on to other neighbours.
Attributes classed as Optional Transitive:
|Optional Non-Transitive||These attributes don’t need to be understood by a BGP speaker and the set flag(s) will not be passed on to other neighbours.
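The four types map onto the flag byte carried with every path attribute. The sketch below uses the flag bit values from RFC 4271 (they are not given in the tables above) and a hypothetical `classify` function. Note that RFC 4271 uses the name "well-known discretionary" for well-known attributes that do not have to be present in every Update.

```python
OPTIONAL = 0x80         # 0 = well-known, 1 = optional
TRANSITIVE = 0x40       # 1 = passed on to other neighbours
PARTIAL = 0x20          # set when a speaker passed on an attribute it did not understand
EXTENDED_LENGTH = 0x10  # attribute length field is 2 bytes instead of 1

# Type codes of the well-known mandatory attributes in RFC 4271:
# 1 = Origin, 2 = AS Path, 3 = Next-Hop.
MANDATORY_CODES = {1, 2, 3}


def classify(flags: int, type_code: int) -> str:
    """Name the attribute type from its flag byte and type code."""
    if flags & OPTIONAL:
        if flags & TRANSITIVE:
            return "optional transitive"
        return "optional non-transitive"
    if type_code in MANDATORY_CODES:
        return "well-known mandatory"
    return "well-known discretionary"


if __name__ == "__main__":
    print(classify(0x40, 1))  # Origin
    print(classify(0xC0, 8))  # Communities
    print(classify(0x80, 4))  # MED
```

Only the Optional and Transitive bits decide the type; the other two bits are listed for completeness.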
A BGP Update message could include some, if not all, of the following attributes:
|Origin (Attribute Code 1)||The Origin attribute identifies the source of the route, i.e. where the route was learnt from. The Origin of a route can be one of:
The rule used for Origin is: Internal is better than External, which is better than Incomplete.|
|AS Path (Attribute Code 2)||The AS Path is a list of the AS numbers between the source AS and our own AS. Its primary uses are to prevent routing loops and to assist in Path Selection and Policy Based Routing (PBR). A BGP router will drop any route it receives where it can see its own AS number within the AS Path; this is how routing loops are prevented. The path also enables the router to make policy decisions based on the presence of certain ASs within the path. Additionally, routes with a shorter AS Path are preferred over routes with a longer AS Path.|
|Next-Hop (Attribute Code 3)||This attribute contains the IP address of the BGP peer that advertises the route. The Next-Hop is used for the reachability and reliability of the BGP session. For eBGP it is usually the peering address associated with the physical link to another AS. iBGP works differently: because of the iBGP rules, you can have situations where the next-hop address isn’t reachable because the route was learnt from another iBGP peer. In this situation the Next-Hop can be changed by policy.|
|Multi Exit Discriminator (Attribute Code 4)||Multi Exit Discriminator (MED) is used when there is more than one route to the same upstream AS. By default, the route with the lowest MED value is preferred.|
|Local Preference (Attribute Code 5)||Local Preference is an important attribute as it is the first attribute evaluated in the Path Selection Process. Local Preference is used for intra-AS BGP sessions and, as the name suggests, is only used to influence traffic within an AS. Oddly, BGP prefers routes with the highest Local Preference.|
|Atomic Aggregate (Attribute Code 6)||The Atomic Aggregate attribute is a notification that tells other BGP speakers along the AS Path that some information has been lost and/or changed due to route aggregation. This may affect best path selection because a less specific route was selected over a more specific route.|
|Aggregator (Attribute Code 7)||The Aggregator attribute is set when an advertised route has been aggregated. This attribute contains the AS number and Router-ID of the router that performed the aggregation.|
|Communities (Attribute Code 8)||The Community attribute is a tag that is used to modify, filter and/or influence a common group of IP prefixes so that they are treated in a user-defined way. Communities are used in conjunction with PBR. A community is a 32-bit (4-octet) value that is commonly written as AS:User-defined, e.g. 100:1, where 100 is the AS number and 1 is a value significant within AS100. An IP-address:User-defined form such as 192.168.100.1:1, with 192.168.100.1 being the device loopback address, does not fit in 32 bits and is carried as an Extended Community instead.|
|Originator ID (Attribute Code 9)||The Originator ID attribute is a loop prevention mechanism used within an iBGP network that uses a Route Reflector. The Route Reflector attaches the Router-ID of the router that originated the route within the AS, so if that router receives a route carrying its own Router-ID it will ignore the route.|
|Cluster List (Attribute Code 10)||Cluster List, similar to the Originator ID attribute, is a loop prevention mechanism. If an iBGP network uses a clustered set of Route Reflectors, each Route Reflector attaches its Cluster ID to the routes it advertises.|
|Multi-Protocol Reachable NLRI (Attribute Code 14)||Multi-Protocol Reachable NLRI has two main functions as defined in RFC 4760:
When this attribute is used in a BGP Update message, the Origin and AS Path attributes have to be included. The Local Preference attribute is additionally added to Update messages for iBGP peering sessions.|
|Multi-Protocol Unreachable NLRI (Attribute Code 15)||The Multi-Protocol Unreachable NLRI attribute is used to withdraw routes for any BGP address families that are no longer being advertised between BGP peers.|
|Extended Communities (Attribute Code 16)||Extended Communities work in the same way as the Community attribute, however they have 8 octets of space to represent the community compared to 4 octets with normal communities. This allows a 64-bit value, which can be represented as Type:Global-Administrator:Local-Administrator. It is important to note that you have a set number of bits to use. With a 2-octet ASN you have 16 bits for the Type, 16 bits for the Global-Administrator (the ASN) and 32 bits for the Local-Administrator (commonly user defined); when the Global-Administrator is an IPv4 address it takes 32 bits, leaving 16 bits for the Local-Administrator.|
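Two of the attributes above are easy to show in code: the 32-bit encoding of a standard community and the AS Path loop check. This is a minimal sketch with hypothetical function names; it assumes a standard community written as AS:value with 16 bits for each half, and it treats the AS Path as a plain sequence of AS numbers.

```python
from typing import Sequence


def encode_community(asn: int, value: int) -> int:
    """Pack AS:value into the 32-bit number carried in the attribute."""
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (asn << 16) | value


def decode_community(raw: int) -> str:
    """Turn the 32-bit number back into the AS:value notation."""
    return f"{raw >> 16}:{raw & 0xFFFF}"


def has_as_loop(local_as: int, as_path: Sequence[int]) -> bool:
    """A route is dropped when our own AS number already appears in its AS Path."""
    return local_as in as_path


if __name__ == "__main__":
    raw = encode_community(100, 1)
    print(raw, decode_community(raw))
    print(has_as_loop(65001, [65010, 65001, 65020]))
```

The range check in `encode_community` is the reason a value such as a full IPv4 address cannot be used as the first half of a standard community.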
BGP Path Selection
When a destination prefix can be reached by multiple routes via BGP, by default only one path will be installed into the Routing Table. With this in mind, BGP uses its Route Selection Algorithm to determine which path will be installed. The algorithm uses the following steps:
- Prefer the highest Local Preference Value
- Prefer the path with the shortest AS Path
- Prefer the route with the lowest Origin value
- Prefer the route with the lower MED
- Prefer a prefix learnt via eBGP over one learnt via iBGP
- Prefer the path with the better exit out of the local AS. This means that the underlying IGP metric cost is taken into consideration; the path with the lowest IGP metric is preferred
- Prefer the eBGP route that has the longest uptime, or the route from the peer with the lowest Router ID
- Prefer routes with the shortest Cluster List length. This applies when you use a Route Reflector within your iBGP peering sessions
- Prefer routes from the peer with the lowest IP address
Some vendors have their own vendor-specific additions to the path selection algorithm. Cisco uses Weight before checking Local Preference, and Juniper verifies that the Next-Hop is reachable before checking Local Preference. With JunOS, if the Next-Hop can’t be verified then the route is marked as a Hidden route and will need investigating.
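The generic steps listed above can be approximated as a single sort key, where each step only matters if all earlier steps tie. This is a rough sketch rather than any vendor's implementation: the `Path` fields are hypothetical, the Origin values assume 0 = Internal, 1 = External, 2 = Incomplete, and it leaves out the vendor-specific checks, the route uptime tie-break and the conditions under which real routers decide whether MED is comparable.

```python
from dataclasses import dataclass


@dataclass
class Path:
    local_pref: int
    as_path_len: int
    origin: int            # assumed: 0 = Internal, 1 = External, 2 = Incomplete
    med: int
    is_ebgp: bool
    igp_metric: int
    router_id: int         # Router ID as an integer for easy comparison
    cluster_list_len: int
    peer_address: int      # peer IP address as an integer


def sort_key(p: Path):
    """Lower tuples win; Local Preference is negated because higher is better."""
    return (
        -p.local_pref,
        p.as_path_len,
        p.origin,
        p.med,
        0 if p.is_ebgp else 1,
        p.igp_metric,
        p.router_id,
        p.cluster_list_len,
        p.peer_address,
    )


def best_path(paths):
    """Return the single path that would be installed for a prefix."""
    return min(paths, key=sort_key)


if __name__ == "__main__":
    a = Path(100, 3, 0, 0, True, 10, 1, 0, 1)
    b = Path(200, 5, 2, 50, False, 20, 2, 1, 2)
    print(best_path([a, b]) is b)  # highest Local Preference wins first
```

Using a tuple keeps the ordering of the steps explicit: Python compares the first element and only looks at the next one when the earlier ones are equal, which mirrors how the algorithm falls through its tie-breakers.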