Tag Archives: routing

Border Gateway Protocol (BGP)

Reading Time: 9 minutes

BGP Basics

What is BGP

Border Gateway Protocol (BGP) is regards as the most influential network protocols as it is backbone of the internet today. BGP is a Path Vector Routing Protocol, that unlike other routing protocols uses TCP (port 179, as its transport layer) to establish connectivity before exchanging routing information with another BGP speaker (peer). BGP communication can be done between same and/or different networks, these networks are known as Autonomous Systems (AS) with an  AS being a set of Routers that are managed by single entity, business and/or company. BGP uses routing information to maintain a BGP Routing Information Base (RIB) of Network Layer Reachability Information (NLRI) which it will exchange with other BGP peer or Peer ASs. BGP is a classless protocol, it can support any IP prefix regardless of class, this is both for IPv4 and IPv6. It is important to note that requires TCP connection first before building BGP connection, without that first established session a BGP peering never happen, however once that session is connected it will not have to made again unless a change is made. BGP uses Keepalive messages to ensure reliability of the session as it does not use any transport protocol-based keep-alive mechanism to determine if peers are reachable.

BGP Usage

BGP is largely (but not exclusively) used in large enterprises and data centre hosting environments where the need for single or multihomed to multiple Internet Service Providers (ISPs) connections are needed, this is known as Exterior BGP (eBGP). BGP is extensively used with Service Provider environments. BGP allows a large range of the policy based controls for an AS to influence and/or manipulate routed inbound and outbound traffic to help optimise the movement of traffic for their own needs. Additionally BGP can be used between BGP routers within the same AS to advertise internal routes with the same level of control as eBGP, with some small however important difference, this is known as Interior BGP (iBGP).

eBGP vs iBGP

There are some Key Differences between eBGP and iBGP that are important to note:

eBGP

  • eBGP session is between BGP peers with different AS numbers
  • Inter-AS communication is by via eBGP
  • eBGP respects the AS_Path Path Attribute
  • Routes learnt via eBGP will be advertised to other eBGP and iBGP peers

iBGP

  • iBGP session is between BGP peers with the same AS number
  • Intra-AS communication can be by via iBGP
  • iBGP commonly uses an IGP for network reachability and to establish BGP TCP session via Loopback address
  • Routes learnt via iBGP will not be advertised to other iBGP peers however will advertise routes to an eBGP peer

The above isn’t the full differences but just some of the main difference that need to remember. Additionally there are situations where some of these rules may need to be manipulated and can be done in design and/or configuration however that is for later

BGP Peering States

When establishing a BGP session there are 6 states that need to be completed before peering session comes up. The first 3 states are to ensure the TCP transport layer connectivity is there, once this has been completed then BGP connectivity is established with the final 3 states:

BGP State Connect Description
Idle TCP This is when all BGP connections will be refused. An Idle state occurs when the BGP session hasn’t been configured on the other BGP peer or BGP has isn’t enabled at all. Commonly, a start event is required from the other peer to prepare the TCP connectivity.
Connect TCP The router listening for TCP connections and is waiting for the TCP 3-way handshake to be completed:

  • If this is completed, then an Open message is sent and is transitioned into the OpenSent State.
  • If TCP connections fails, the BGP peer restarts the ConnectRetryTimer and waits for the remote peer to initiate a TCP connection and transitions into an Active State.
Active TCP This is when the BGP peer is trying to establish TCP connection.
OpenSent BGP When in the OpenSent, an open message has been sent by the BGP peer however has not received by the local peer:

  • Once the message has been received, checked and has no errors, the local peer will send a Keepalive message.
  • If a message is received, checked and an error is found then state is transitioned back to an Idle state.
OpenConfirm BGP When in the OpenConfirm, the BGP is waiting on a Keepalive or Notification message:

  • Once the peer receives a Keepalive message, it will move into the Established state
  • If the local peer does not receive a Keepalive within the negotiated session hold timer, it will send a Notification message and transition back to Idle state. The same will occur if the local peer sends out a Notification message.
Established BGP Having received the Keepalive message, the BGP session is fully Established. The peers are now able to exchange Update, Notification and Keepalive messages

BGP Message Types

As shown above, there are a number of different messages sent between BGP peers to Establish a session and when even the peering has been established, messages are used to ensure that both peers have synchronized routing information. BGP can only process a message after the entire message has been received, the maximum message size is 4096 bytes with 19 bytes being the smallest message size, this would be just be a header with no data. Each message type uses a fixed header size of 19 bytes with BGP Keepalives not include any data after the header, so they will always use the minimum size.

Each Message would be include the following:

Message Type Description
Open Once TCP connection has been completed both peers will send out an Open Message. This message starts the peering session, it provides details about the remote peer, in addition to details about supported and optional options.

These details are included:

  1. BGP version (normally version 4)
  2. AS number
  3. Hold Time
  4. Router ID
  5. Par-Len
    • If this set, it informs the peer that optional parameters should be expected
  6. Optional Parameters
    • This is where negotiable parameters are indicated, these would be authentication and capability extension such as Multiprotocol Extensions and route refresh.
Update An Update Message sends a list of new, withdrawn or types of routes from the remote peer. Depending on the routing policy of remote peer these may or not be entered into the Routing Table.

These details are included:

  1. Unfeasible Route Length
    • If this is set, it will tell the peer, the length of withdraw routes
  2. Withdrawn Routes
    • Lists IP prefixes that have been removed as they are no longer deemed as reachable
  3. Path Attribute Length
    • This indicates the total length of Path Attributes field. Its value allows the length of the Network Layer Reachability field to be determined. A value of 0 determines that neither Path Attribute and NLRI is present in the update.
  4. Path Attribute
    • The following properties for a route is included:
      1. Origin
      2. AS Path
      3. Next-Hop
      4. Multi-Exit Discriminator (MED)
      5. Local Preference
  5. Network Length Reachability Information (NLRI)
    • Lists IP prefixes that will be advertised as reachable via the AS
Keepalive It is important to always remember that Keepalive Messages are not used to ensure the TCP connection between peers is kept. They are used to ensure that BGP Hold Timers do not expire keeping alive the route exchange.
Notification Notification Message is used to inform a peer that there is an error with the BGP session.

There are 6 Error code numbers:

  1. Message Header Error
  2. Open Message Error
  3. Update Message Error
  4. Hold Timer Expired
  5. Finite State Machine Error
  6. Cease

In addition to 17 Sub Error codes (6 Open Message Errors and 11 Update Message Errors). These can found in RFC4271

Refresh Normally BGP can not readvertise routes that have already been acknowledged by a peer, if the BGP peer has been configured to soft clear of BGP sessions then peers will be able to exchange Refresh Messages. Some vendors you have to explicitly configure this, in Cisco you need to configure soft-reconfiguration whereas with Juniper it is set by default within JunOS.

BGP Attributes

Unlike other Routing Protocols, BGP primary function is to find the best path to a destination and not the shortest path. BGP uses a number of attributes to calculate the best path for any given destination prefix. These attributes can be broken down into 4 types:

Well Known Attribute Types
Well known Mandatory These attributes must be known and understood by all BGP speakers. Additionally must exist within the BGP update messages.

Attributes classed as Well Known Attributes:

  • Origin
  • AS Path
  • Next-Hop
Well known Optional These attributes must be known and understood by all BGP speakers. However they don’t have to exist within a BGP update message.

Attributes classed as Well Known Optional Attributes:

  • Local Preference
  • Atomic Aggregator
Optional BGP Attribute Types
Optional Transitive Attributes don’t need to be understood by a BGP speaker however the set flag(s) will need to be passed onto other neighbours.

Attributes classed as Optional Transitive:

  • Aggregator
  • Community
  • Extended Community
Optional Non-Transitive These attributes don’t need to be understood by a BGP speaker and the set flag(s) will not be passed onto other neighbours.

  • Multi Exit Discriminator
  • Originator ID
  • Cluster List
  • Multiprotocol Reachable NLRI
  • Multiprotocol Unreachable NLRI

A BGP Update message could include some, if not all, of the following attributes:

 
Message Information
 Origin (Attribute Code 1) The Origin Attribute confirms the source of the route aka where the route was learnt from. The Origin of a route can either be:

  1. I: Internal (0) The Route is learnt from IGP

  2. E: External (1) The Route is learnt from EGP

  3. ?: Incomplete (2) The Route is learnt by something that isn’t by Internal or External methods

The rule used for Origin is that: Internal is better than External which is better than Incomplete

 AS Path (Attribute Code 2) AS Path is a list of AS numbers that are between the source AS router to the our own AS. The AS Path is primary usages are to prevent Routing Loops, assist in the Path Selection and Policy Based Routing (PBR). BGP router will drop any routes received where it can see its own AS number within the AS Path this is how Routing Loops are prevented. The path enables the router to make policy decisions based on the presence of certain AS’s within the path. Additionally routes with a shorter AS Path are preferred over routes with longer AS Path
Next-Hop (Attribute Code 3) This Attribute contains the IP address of the BGP peer that advertises the route. The Next-Hop is used for reachability and reliable of for the BGP session. For eBGP it is usually the peering address associated with the physical link with another AS. iBGP works differently as you can have situations where due to rules with iBGP the next-hop address isn’t reachable due to learning the route from another iBGP peer, in this situation the Next-Hop can be changed by policy.
Multi Exit Discriminator (Attribute Code 4) Multi Exit Discriminator (MED) is used when there are more than one route to the same upstream AS. The route with the lowest MED value is always preferred by default.
Local Preference (Attribute Code 5) Local Preference is an important attribute as it is the first attribute evaluated in the Path Selection Process. Local Preference is used for Infra-AS traffic communications for BGP session. As the name, suggests is only used to influence traffic within an AS. Oddly BGP prefers routes with the Highest Local Preference.
Atomic Aggregator (Attribute Code 6) Atomic Aggregator attribute is a notification that tells other BGP speakers within the AS-Path that some information has been lost and/or changed due to route aggregation. This may affect the best path selection because a less specific route was selected over more specific route.  
Aggregator (Attribute Code 7) Aggregator attribute is set when an advertised route has been aggregated. This attribute contains the AS number and Router-ID of the Router that has performed the aggregation
Communities (Attribute Code 8) Community attribute is tag that is use to modify, filter and/or influence a common group of IP Prefix(es) to act in a user defined way. Communities uses 4-octets of space to represent its value. Communities are used in conjunction with PBR. A community is 32-bit value, that is common defined as AS/IP-address:User-defined ie 100:1 or 192.168.100.1:1. 100 would be the AS or 192.168.100.1 being the device loopback address with 1 being a value significant within AS100.
Originator ID (Attribute Code 9) Originator attribute is a loop prevention mechanism used within iBGP network using a Route Reflector. The Route Reflector attaches if own Router-ID to routes, so if it receives a route with its own Router-ID it will ignore the route.
Cluster List (Attribute Code 10) Cluster List similar to the Originator ID attribute is a loop prevention mechanism however if an iBGP network is used clustered set of Route Reflectors then routes have the Route Reflectors Cluster ID attached to the advertised routes.
Multi-Protocol Reachable NLRI (Attribute Code 14) Multi-Protocol Reachable NLRI has two main functions as defined in RFC 4760:

  1. Negotiates what non IPv4 unicast families will be announced between two BGP peers.

  2. Defines the Network Layer Address of the router that should be the next-hop of the destination families. Ie if you have advertised l2vpn bgp family the next-hop for this bgp family will be defined within this attribute.

When this attribute is used in a BGP Update message, the Origin and AS Path attributes have to be included. Local Preference attribute is additionally added to Update messages for iBGP peering sessions.

Multi-Protocol Unreachable NLRI (Attribute Code 15) Multi-Protocol Unreachable NLRI attribute is used to withdraw any BGP families that are no longer being advertised between BGP peers.
Extended Communities (Attribute Code 16) Extended Communities are the same as Community attribute however it has 8 octets of space to represent the community compared to 4 octets with normal communities. This allows 64-bit value, it can be represented as Type:Global-Administrator:Local-Administrator. It is important to note that you have set amount of bits you can use. You will have 16 bits for the Type, 16 bits, for the Global-Administrator (commonly the ASN/IP address) and 32 bits, for the Local-Administrator (commonly user defined).   

BGP Path Selection

When a destination prefix reached by multiple routes via BGP by default only one path will be advertised into the Routing Table. With this in mind BGP has used its Route Selection Algorithm to determine what path will be installed into the Routing Table. The algorithm uses the following steps:

  • Prefer the highest Local Preference Value
  • Checks what path has shortest AS Path
  • The Route with the Lowest Origin Value
  • If the route has a Lower MED
  • If the Prefix is learnt via eBGP is preferred over being learnt via iBGP
  • The path with the better exit out of the local AS. This means that the underlying IGP metric cost is taken into consideration, the path with the lowest IGP is preferred
  • The eBGP route that has the longest uptime or prefer the routes from the peer with lowest Router ID
  • Prefer routes with the shortest Cluster List Length. This is when you use a Route Reflector within your iBGP peering session
  • Prefer routes from a peer with the lowest IP Address

Some vendors have their own vendor specific additions to the path selection algorithm. Cisco use Weight before checking Local Preference and Juniper verify that the Next-Hop is reachable before checking Local Preference. With JunOS, if the Next-Hop isn’t verified then the route is set as a Hidden route and will need investigating.

Share this:
Share

Layer-2 VPNs on Junos

Reading Time: 7 minutes

It has been a busy few weeks trying to stay ahead of all the new work that has been coming towards myself and the team, due to the in sourcing of the core network! Lucky enough for my team, we have finally got our hands onto full end-to-end connectivity! Fun times 😀

With that being said, I’ve been given a wee project to provision a circuit for a business customer between two sites for a Proof Of Concept. As this circuit is being using as a POC (for now), it was agreed that a Layer 2 VPN (L2VPN/pseudowire) will be best suited, because a simple point-to-point connection was needed between two PEs. As we have a MPLS enabled network, it was decided that would be the easiest way to get their POC up and running quickly, as we were under a bit of a hard deadline!

For me, it was good little project, even though I know what L2VPNs were and how they work, I had never configured one myself. You see where I’m going with this now?

This post will over note how to configure L2VPN with Junos 😀

L2VPN, also known as a pseudowire, is defined in RFC4665, where they are called Virtual Private Wire Service (VPWS):

The PE devices provide a logical interconnect such that a pair of CE devices appears to be connected by a single logical Layer 2 circuit. PE devices act as Layer 2 circuit switches. Layer 2 circuits are then mapped onto tunnels in the SP network. These tunnels can either be specific to a particular VPWS, or be shared among several services. VPWS applies for all services, including Ethernet, ATM, Frame Relay, etc. Each PE device is responsible for allocating customer Layer 2 frames to the appropriate VPWS and for proper forwarding to the intended destinations.

In essence, L2VPNs are virtual point-to-point circuit that use the underlying Transport Labels (LDP/RSVP) or a statically defined MPLS path to go between two PE’s, that allows the extension of a layer 2 broadcast domain. If you need multiple sites on the same layer 2 broadcast you will need to consider Virtual Private Lan Service (VPLS) or Ethernet VPN (EVPN).

Within Junos there are 3 ways of configuring L2VPNs, two are regarded as modern way and has been rectified with RFC’s with an additional legacy method. Kompella and Martini are regarded as the industry standard, with Circuit Cross-Connect (CCC) seen as legacy:

  • Circuit Cross-Connect: The Circuit Cross-Connect style of L2VPN uses a single Outer Label, also known as the Tunnel/Transport Label, to transport L2 payload from PE to PE. CCC can ONLY use RSVP as MPLS transport, in addition each CCC connection has its own dedicated RSVP-signalled LSP associated, the transport label cannot be shared between multiple connections. LSPs are manually created on each PE to determines which circuit the frame belongs to on the other end.
  • Martini: The Martini style of L2VPN has a pair of labels before the L2 frame. The Outer label is the transport mechanism that allows the frame from egress interface from the sending PE to ingress interface of the receiving PE. The Inner label, known as the VC Label, is the label that informs the receiving PE, where the L2VPN payload should go. It is important to note that if you are using the Martini style, although either LDP or RVSP can be used MPLS transport, that LDP is used for the signalling of the VC label. So if the RSVP is used as the MPLS transport, LDP will need to be enabled on the loopback address of both PE routers. A minimum of 2 LSPs will need to be set, as MPLS LSPs are unidirectional.
  • Kompella: The Kompella style of L2VPN is similar to Martini style as both use stacked labels before the Layer 2 payload and both can use LDP, RSVP or both as Transport Label. There difference comes in that unlike Martini, Kompella uses BGP signalling as its VC Label. This means you will need to have BGP enabled network, in addition, it’s not compulsory to send static LSPs as BGP provides a mechanism for autodiscovery of new point-to-point links similar to a VPLS. Although Kompella has a more complex configuration, because of its usage of BGP signalling it is regarded as the best option for large scale deployments as it will in-conjunction with other BGP families. RFC6624 has more details on L2VPN using BGP for Auto-Discovery and Signaling

In our network, we use the Kompella style of L2VPNs. The bulk and most depth of my testing was with that method… Although I was able to get a wee bit of naughty time after to configure the other methods 🙂

The topology I’ll be working with is a simple one. I’ve a got a single MX480 broken up into 3 Logical Systems.

L2VPN Topology


The underlying IGP is IS-IS with RSVP, LDP and BGP enabled. This is a mirror, of what we have in production. With all the L2VPNs the customer facing physical interface has to be set to the correct encapsulation. For my testing, as I wont be using VLANs, Bridging or Setting a VPLS. I used ethernet-ccc and had set the logical interface to family ccc, you can find out more about the different physical encapsulations here

Interface ConfigRSVPMPLSBGPIS-ISLDP
set interfaces xe-0/1/0 enable
set interfaces xe-0/1/0 encapsulation ethernet-ccc
set interfaces xe-0/1/0 unit 0 family ccc
set protocols rsvp interface xe-1/0/0.0
set protocols rsvp interface xe-1/0/2.0
set protocols mpls explicit-null
set protocols mpls ipv6-tunneling
set protocols mpls no-decrement-ttl
set protocols mpls interface xe-1/0/0.0
set protocols mpls interface xe-1/0/2.0
set protocols bgp group Master type internal
set protocols bgp group Master local-address 192.168.2.1
set protocols bgp group Master family inet unicast
set protocols bgp group Master family inet6 unicast
set protocols bgp group Master local-as 100
set protocols bgp group Master neighbor 192.168.2.2 
set protocols bgp group Master neighbor 192.168.2.3
set protocols isis reference-bandwidth 1000g
set protocols isis level 1 disable
set protocols isis level 2 wide-metrics-only
set protocols isis interface xe-1/0/0.0 ldp-synchronization
set protocols isis interface xe-1/0/0.0 point-to-point
set protocols isis interface xe-1/0/0.0 link-protection
set protocols isis interface xe-1/0/2.0 ldp-synchronization
set protocols isis interface xe-1/0/2.0 point-to-point
set protocols isis interface xe-1/0/2.0 link-protection
set protocols isis interface xe-1/0/3.0 ldp-synchronization
set protocols isis interface xe-1/0/3.0 point-to-point
set protocols isis interface xe-1/0/3.0 link-protection
set protocols isis interface lo0.0
sset protocols ldp track-igp-metric
set protocols ldp explicit-null
set protocols ldp transport-address router-id
set protocols ldp interface xe-1/0/0.0
set protocols ldp interface xe-1/0/2.0
set protocols ldp interface lo0.0

All configurations will be done on the Master and SiteA, and for my examples I will show work done on the Master Instance. With all that out of the way… Let’s get cracking 😀

Kompella

As stated before, BGP is used as the VPN signalling method, with that in mind, we will need to enable layer-2 signalling within MP-BGP. This is simply done by adding the command family l2vpn signaling with the BGP stanza. This can be added globally within BGP or under the specific neighbour.

set protocols bgp group Master family l2vpn signaling

With the signalling sorted we can go straight into the configuration of the L2VPN. Just like L3VPNs, L2VPNs configuration is done within the routing-instance stanza and uses the same parameters as L3VPN by having Route Distinguisher (RD) and Route-Target/vrf-target (RT). The RD has to be unique per device with RT matching on all devices within the L2VPN, this is important, so that traffic can be routed accordingly per site. In addition, routing-instance has to be set to l2vpn and the interface(s) have to be defined within the routing-instance as well.

set routing-instances Master instance-type l2vpn
set routing-instances Master interface xe-0/1/0.0
set routing-instances Master route-distinguisher 100:0001
set routing-instances Master vrf-target target:100:0000

Next the properties for that site within the L2VPN will need to configured under protocol l2vpn within the routing-instance. The encapsulation has to match all site that want to participate within the VPN. The Site identifier must be unique to the entire site within the L2VPN as the site ID is used to compute label values for site-to-site communications. The interface(s) have to be defined within l2vpn and l2vpn site stanzas.

set routing-instances Master protocols l2vpn encapsulation-type ethernet
set routing-instances Master protocols l2vpn interface xe-0/1/0.0
set routing-instances Master protocols l2vpn site Master site-identifier 1
set routing-instances Master protocols l2vpn site Master interface xe-0/1/0.0
Full Kompella Configuration
set routing-instances Master instance-type l2vpn
set routing-instances Master interface xe-0/1/0.0
set routing-instances Master route-distinguisher 100:0001
set routing-instances Master vrf-target target:100:0000
set routing-instances Master protocols l2vpn encapsulation-type ethernet
set routing-instances Master protocols l2vpn interface xe-0/1/0.0
set routing-instances Master protocols l2vpn site Master site-identifier 1
set routing-instances Master protocols l2vpn site Master interface xe-0/1/0.0
set protocols bgp group Master family l2vpn signaling

Verification

The primary command that will be used to check the status of a pseudowire would be show l2vpn connections. As Komplella signalling uses BGP, we will be able to do a show bgp summary and see a route being advertised within the l2vpn and routing instance tables show route table Master.l2vpn.0 or show route table bgp.l2vpn.0 respectfully. Additionally we will be able to mpls.0 table to confirm that the L2VPN incoming label and interface(s) for the pseudowire have made the routing table, by using show route table mpls.0.

Show l2vpn Connectionsshow bgp summaryshow route table Master.l2vpn.0show route table mpls.0
[email protected]> show l2vpn connections    
Layer-2 VPN connections:

Legend for connection status (St)   
EI -- encapsulation invalid      NC -- interface encapsulation not CCC/TCC/VPLS
EM -- encapsulation mismatch     WE -- interface and instance encaps not same
VC-Dn -- Virtual circuit down    NP -- interface hardware not present 
CM -- control-word mismatch      -> -- only outbound connection is up
CN -- circuit not provisioned    <- -- only inbound connection is up
OR -- out of range               Up -- operational
OL -- no outgoing label          Dn -- down                      
LD -- local site signaled down   CF -- call admission control failure      
RD -- remote site signaled down  SC -- local and remote site ID collision
LN -- local site not designated  LM -- local site ID not minimum designated
RN -- remote site not designated RM -- remote site ID not minimum designated
XX -- unknown connection status  IL -- no incoming label
MM -- MTU mismatch               MI -- Mesh-Group ID not available
BK -- Backup connection	         ST -- Standby connection
PF -- Profile parse failure      PB -- Profile busy
RS -- remote site standby	 SN -- Static Neighbor
LB -- Local site not best-site   RB -- Remote site not best-site
VM -- VLAN ID mismatch

Legend for interface status 
Up -- operational           
Dn -- down

Instance: Master
  Local site: Master (1)
    connection-site           Type  St     Time last up          # Up trans
    2                         rmt   Up     Jun  4 12:36:46 2016           2
      Remote PE: 192.168.2.2, Negotiated control-word: Yes (Null)
      Incoming label: 800001, Outgoing label: 800000
      Local interface: xe-0/1/0.0, Status: Up, Encapsulation: ETHERNET

[email protected]> show bgp summary 
Groups: 1 Peers: 2 Down peers: 0
Table          Tot Paths  Act Paths Suppressed    History Damp State    Pending
inet.0               
                       0          0          0          0          0          0
inet6.0              
                       0          0          0          0          0          0
bgp.l2vpn.0          
                       1          1          0          0          0          0
Peer                     AS      InPkt     OutPkt    OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
192.168.2.2             100       3234       3229       0       1  1d 0:19:17 Establ
  inet.0: 0/0/0/0
  inet6.0: 0/0/0/0
  Master.l2vpn.0: 1/1/1/0
  bgp.l2vpn.0: 1/1/1/0
192.168.2.3             100       5735       5724       0       1 1d 19:06:59 Establ
  inet.0: 0/0/0/0
  inet6.0: 0/0/0/0
  Master.l2vpn.0: 0/0/0/0
  bgp.l2vpn.0: 0/0/0/0/

[email protected]> show route table Master.l2vpn.0 

Master.l2vpn.0: 2 destinations, 2 routes (2 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100:1:1:1/96                
                   *[L2VPN/170/-101] 1d 20:37:12, metric2 1
                      Indirect
100:2:2:1/96                
                   *[BGP/170] 00:01:02, localpref 100, from 192.168.2.2
                      AS path: I, validation-state: unverified
                    > to 192.168.1.14 via xe-1/0/0.0, Push 0
                      to 192.168.1.6 via xe-1/0/2.0, Push 300000

[email protected]> show route table mpls.0 protocol l2vpn    

mpls.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

800001             *[L2VPN/7] 23:30:09
                    > via xe-0/1/0.0, Pop       Offset: 4
xe-0/1/0.0         *[L2VPN/7] 00:06:20, metric2 100
                    > to 192.168.1.14 via xe-1/0/0.0, Push 800000 Offset: 252
                      to 192.168.1.6 via xe-1/0/2.0, Push 800000, Push 300000(top) Offset: 252

From the end host point of view, we have end-to-end connectivity 😀

[email protected]:~$ ping -c 2 -q 192.168.137.3
PING 192.168.137.3 (192.168.137.3) 56(84) bytes of data.

--- 192.168.137.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.431/0.637/0.843/0.206 ms
Note
The route given from the show route table Master.l2vpn.0 is the Route Distinguisher of the other end of the pseudowire

Martini

Martini signalling uses LDP, as stated before, and with LDP enabled already, I will focus on the actual configuration, which is done within the protocol l2circuit stanza. Compared to Kompella, the configuration for Martini style of L2VPNs is much simpler. All that is needed is for:

  • The remote neighbour to be defined. In my example I will be using the loopback address SiteA as the remote neighbour
  • The customer facing interface connecting into the VPN
  • Set a circuit ID, that must match on both sides

All this can be done in one line!

set protocols l2circuit neighbor 192.168.2.2 interface xe-0/1/0.0 virtual-circuit-id 1

With that we have Martini style L2VPN configured 🙂

Verifications

To check the status of Martini style L2VPN, you will use show l2circuit connections, the output is near enough the same as show l2vpn connections. Martini, as discussed above, uses LDP for the signalling, we will be able to use show ldp neighbor to check that the neighbour relationship with the remote side has been successful and we will be able to check the LDP database by using show ldp database to verify that new labels associated with the pseudowire (L2CKT) has been installed into the database. Additionally you can check the inet.3 and mpls.0 routing tables, by using show route table inet.3 & show route table mpls.0

Show l2circuit Connectionsshow ldp neighborshow ldp databaseshow route table inet.3show route table mpls.0
[email protected]> show l2circuit connections 
Layer-2 Circuit Connections:

Legend for connection status (St)   
EI -- encapsulation invalid      NP -- interface h/w not present   
MM -- mtu mismatch               Dn -- down                       
EM -- encapsulation mismatch     VC-Dn -- Virtual circuit Down    
CM -- control-word mismatch      Up -- operational                
VM -- vlan id mismatch		 CF -- Call admission control failure
OL -- no outgoing label          IB -- TDM incompatible bitrate 
NC -- intf encaps not CCC/TCC    TM -- TDM misconfiguration 
BK -- Backup Connection          ST -- Standby Connection
CB -- rcvd cell-bundle size bad  SP -- Static Pseudowire
LD -- local site signaled down   RS -- remote site standby
RD -- remote site signaled down  HS -- Hot-standby Connection
XX -- unknown

Legend for interface status  
Up -- operational            
Dn -- down                   
Neighbor: 192.168.2.2 
    Interface                 Type  St     Time last up          # Up trans
    xe-0/1/0.0(vc 1)          rmt   Up     Jun  5 14:03:37 2016           1
      Remote PE: 192.168.2.2, Negotiated control-word: Yes (Null)
      Incoming label: 300000, Outgoing label: 300016
      Negotiated PW status TLV: No
      Local interface: xe-0/1/0.0, Status: Up, Encapsulation: ETHERNET
      Flow Label Transmit: No, Flow Label Receive: No

[email protected]> show ldp neighbor      
Address            Interface          Label space ID         Hold time
192.168.2.2        lo0.0              192.168.2.2:0            43
192.168.1.6        xe-1/0/2.0         192.168.2.3:0            14
192.168.1.14       xe-1/0/0.0         192.168.2.2:0            13

[email protected]> show ldp database                             
Input label database, 192.168.2.1:0--192.168.2.2:0
  Label     Prefix
 299984      192.168.2.1/32
      0      192.168.2.2/32
 300000      192.168.2.3/32
 300016      L2CKT CtrlWord ETHERNET VC 1

Output label database, 192.168.2.1:0--192.168.2.2:0
  Label     Prefix
      0      192.168.2.1/32
 299968      192.168.2.2/32
 299984      192.168.2.3/32
 300000      L2CKT CtrlWord ETHERNET VC 1

Input label database, 192.168.2.1:0--192.168.2.3:0
  Label     Prefix
 300016      192.168.2.1/32
 300000      192.168.2.2/32
      0      192.168.2.3/32

Output label database, 192.168.2.1:0--192.168.2.3:0
  Label     Prefix
      0      192.168.2.1/32
 299968      192.168.2.2/32
 299984      192.168.2.3/32

[email protected]> show route table inet.3 192.168.2.2 

inet.3: 3 destinations, 4 routes (3 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

192.168.2.2/32     *[LDP/9] 1d 21:16:06, metric 100
                    > to 192.168.1.14 via xe-1/0/0.0, Push 0
                      to 192.168.1.6 via xe-1/0/2.0, Push 300000
                    [RSVP/10/1] 1d 01:14:11, metric 100
                    > to 192.168.1.6 via xe-1/0/2.0, label-switched-path to-siteA

[email protected]> show route table mpls.0 protocol l2circuit 

mpls.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

300000             *[L2CKT/7] 00:05:13
                    > via xe-0/1/0.0, Pop       Offset: 4
xe-0/1/0.0         *[L2CKT/7] 00:05:13, metric2 100
                    > to 192.168.1.14 via xe-1/0/0.0, Push 300016 Offset: 252
                      to 192.168.1.6 via xe-1/0/2.0, Push 300016, Push 300000(top) Offset: 252

From the end host point of view, connectivity between the two is there 🙂

[email protected]:~$ ping -c 2 -q 192.168.137.3
PING 192.168.137.3 (192.168.137.3) 56(84) bytes of data.

--- 192.168.137.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.358/0.532/0.707/0.176 ms

Circuit Cross-Connect

As CCC doesn’t support stacked labels unlike Kompella and Martini, we will need to configure 2 static LSPs between the PE routers. CCC needs to have a LSP for to transmit and another to receive traffic. So firstly, we will need to get the LSPs configured. The received LSP will be configured on the remote PE, so under protocols mpls label-switched-path stanza, this is where we will define the LSP. I've used the loopback address of the remote end with the underlying IGP working out the best path.

set protocols mpls label-switched-path to-siteA to 192.168.2.2
set protocols mpls label-switched-path to-siteA no-cspf

With the LSPs configured, we will need to go under the protocol connections stanza. We need to define the customer facing interface(s) that will be connecting into the VPN, then set the transmit LSP and receive LSP, this will be the name of the LSP set on the remote end.

set protocols connections remote-interface-switch siteA interface xe-0/1/0.0
set protocols connections remote-interface-switch siteA transmit-lsp to-siteA
set protocols connections remote-interface-switch siteA receive-lsp to-Master

With that we are sorted!

Verifications

In regards with CCC there's less show commands, from what I’ve found (let me know if there's more please), but we can check the pseudowire's status by using show connections. We can confirm the Transmit (Ingress) and Receive (Egress) LSP using show mpls lsp and finally, we will be able to mpls.0 table to confirm that the L2VPN incoming label and interface(s) for the pseudowire have made the routing table, by using show route table mpls.0.

Show Connectionsshow mpls lspshow route table mpls.0
[email protected]> show connections 
CCC and TCC connections [Link Monitoring On]
Legend for status (St):             Legend for connection types:
 UN -- uninitialized                 if-sw:  interface switching
 NP -- not present                   rmt-if: remote interface switching
 WE -- wrong encapsulation           lsp-sw: LSP switching
 DS -- disabled                      tx-p2mp-sw: transmit P2MP switching
 Dn -- down                          rx-p2mp-sw: receive P2MP switching
 -> -- only outbound conn is up     Legend for circuit types:
 <- -- only inbound  conn is up      intf -- interface
 Up -- operational                   oif  -- outgoing interface
 RmtDn -- remote CCC down            tlsp -- transmit LSP
 Restart -- restarting               rlsp -- receive LSP


Connection/Circuit                Type        St      Time last up     # Up trans
siteA                             rmt-if      Up      Jun  3 12:42:55           1
  xe-0/1/0.0                        intf  Up
  to-siteA                          tlsp  Up
  to-Master                         rlsp  Up

[email protected]> show mpls lsp                           
Ingress LSP: 1 sessions
To              From            State Rt P     ActivePath       LSPname
192.168.2.2     192.168.2.1     Up     0 *     to-siteA         to-siteA
Total 1 displayed, Up 1, Down 0

Egress LSP: 1 sessions
To              From            State   Rt Style Labelin Labelout LSPname 
192.168.2.1     192.168.2.2     Up       0  1 FF  300080        - to-Master
Total 1 displayed, Up 1, Down 0

Transit LSP: 0 sessions
Total 0 displayed, Up 0, Down 0

[email protected]> show route table mpls.0 protocol ccc    

mpls.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

300080             *[CCC/7] 00:00:04
                    > via xe-0/1/0.0, Pop      
xe-0/1/0.0         *[CCC/10/1] 00:00:04, metric 100
                    > to 192.168.1.14 via xe-1/0/0.0, label-switched-path to-siteA

Finally to confirm end-to-end reachability between the end hosts

[email protected]:~$ ping -c 2 -q 192.168.137.3
PING 192.168.137.3 (192.168.137.3) 56(84) bytes of data.

--- 192.168.137.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.358/0.532/0.707/0.176 ms

I had planned to have a wee bit more to this post, with what I was actually testing ,however, this is getting a bit longer than I expected, so I'll make this into a two-part 😉

My next post will detail, how you can use traffic engineering to manipulate a L2VPN path between 2 PE routers! Hope to see you there 😀

References

Darren's Blog L2VPN in Junos
RFC4665
MPLS l2VPN
RFC6624
RFC6074
Vlan based CCC L2vpn

Share this:
Share

Upgrading Dual Routing Engine Juniper MX Series

Reading Time: 4 minutes

In one of my previous post, I explained how you would go about upgrading a Juniper EX switch. I said whenever I got the chance to upgrade a MX Series Router, I’ll get something noted down…… *raises hands* today is the day! As I’ve said in a few posts, there has been a lot of change and now team is now getting access to the Core Juniper MX Series Routers. As part of this increased access, one of our first tasks is it upgrade JunOS from 12.3R5.7 to 14.1R6.4. With most, if not, all MX Series above the MX80, they will come with two Routing Engines (RE), and both are independent of each other. This is being the case, when upgrading a MX; you will need to upgrade each RE by individually.

This post will go over what you will need to do upgrade an MX Router, in my setup I’ll be upgrading a Juniper MX480 Router and I’ll be doing the upgrade via the console port on each Routing Engine.

To link two Routing Engines together, you would need to apply similar configuration to what I used:

set groups re0 system host-name re0-mx480
set groups re0 interfaces fxp0 unit 0 family inet address x.x.x.x/x
set groups re1 system host-name re1-mx480
set groups re1 interfaces fxp0 unit 0 family inet address x.x.x.x/x
set apply-groups re1
set apply-groups re0

set chassis redundancy graceful-switchover
set routing-options nonstop-routing
set system commit synchronize

With that all cleared up.. Let’s get cracking 🙂

Pre Works

Upload the new firmware version to wherever you normally keep it them. Currently, we would normally upload the package into the /var/tmp folder on the device in question

[[email protected] ~]$ scp jinstall-14.3R6.4-domestic-signed.tgz re0-mx480:/var/tmp

After just saying how you to link the two REs together, for an upgrade, you will need to disable graceful-switchover and nonstop-routing. Skipping this step can potentially result in the control plane and forwarding plane having two different JUNOS versions, which can cause a number of potential issues!

deactivate chassis redundancy graceful-switchover
deactivate routing-options nonstop-routing

Upgrade Process

Having disabled both graceful-switchover and nonstop-routing, log onto the Backup RE, either by console or from the Master RE run the command request routing-engine login re1. Once on the Backup RE, you will need to run the command request system software validate add /var/tmp/xxx reboot.

[email protected]> request system software validate add /var/tmp/jinstall-14.1R6.4-domestic-signed.tgz reboot
NOTE
If you’re like us and save the new firmware package to the local device, when you run the software add command DO NOT set what RE has the package stored. If you do you add the package’s location once the upgrade is completed, on one of the RE, it will delete the image from the device!
Additionally you had requested a session from re0 to re1 to connect to Backup RE, once the RE reboots, you get this message and get booted off

[email protected]>                                                                                
*** FINAL System shutdown message from [email protected] ***                 

System going down IMMEDIATELY                                                  

                                                                               
rlogin: connection closed

If you have console access, you can watch the upgrade ticking along, if you don’t, you can confirm the Backup RE is up and running by using the command show chassis routing-engine, it will show the status and hardware stats for both Routing Engines.

show chassis routing-engine output
[email protected]> show chassis routing-engine    
Routing Engine status:
  Slot 0:
    Current state                  Master
    Election priority              Master (default)
    Temperature                 31 degrees C / 87 degrees F
    CPU temperature             37 degrees C / 98 degrees F
    DRAM                      3584 MB (3584 MB installed)
    Memory utilization          20 percent
    CPU utilization:
      User                       0 percent
      Background                 0 percent
      Kernel                     4 percent
      Interrupt                  0 percent
      Idle                      96 percent
    Model                          RE-S-2000
    Serial ID                      9012021718
    Start time                     2016-03-22 13:14:37 GMT
    Uptime                         3 hours, 38 minutes, 16 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       0.01       0.01       0.00
Routing Engine status:
  Slot 1:
    Current state                  Backup
    Election priority              Backup (default)
    Temperature                 33 degrees C / 91 degrees F
    CPU temperature             38 degrees C / 100 degrees F
    DRAM                      3584 MB (4096 MB installed)
    Memory utilization          16 percent
    CPU utilization:
      User                       0 percent
      Background                 0 percent
      Kernel                     0 percent
      Interrupt                  0 percent
      Idle                     100 percent
    Model                          RE-S-2000
    Serial ID                      9012022174
    Start time                     2016-03-22 16:47:56 GMT
    Uptime                         4 minutes, 45 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       0.34       0.47       0.23

Having upgraded Backup RE, to reduce the downtime and service impact, you can failover the Master Routing Engine, so that the Backup becomes the new Master Routing Engine. This is an manual process by running the command, from the current Master RE, request chassis routing-engine master switch. This WILL cause a brief outage as the PFE is reset and the new firmware is loaded.

[email protected]> request chassis routing-engine master switch    
warning: Traffic will be interrupted while the PFE is re-initialized
Toggle mastership between routing engines ? [yes,no] (no) yes 

Resolving mastership...
Complete. The other routing engine becomes the master.

You can confirm by running show chassis routing-engine again on RE0

AFTER failing over Routing Engine
[email protected]> show chassis routing-engine 
Routing Engine status:
  Slot 0:
    Current state                  Backup
    Election priority              Master (default)
    Temperature                 32 degrees C / 89 degrees F
    CPU temperature             39 degrees C / 102 degrees F
    DRAM                      3584 MB (3584 MB installed)
    Memory utilization          16 percent
    CPU utilization:
      User                       2 percent
      Background                 0 percent
      Kernel                     1 percent
      Interrupt                  0 percent
      Idle                      97 percent
    Model                          RE-S-2000
    Serial ID                      9012021718
    Start time                     2016-03-22 13:14:37 GMT
    Uptime                         3 hours, 50 minutes, 7 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       0.21       0.07       0.02
Routing Engine status:
  Slot 1:
    Current state                  Master
    Election priority              Backup (default)
    Temperature                 33 degrees C / 91 degrees F
    CPU temperature             41 degrees C / 105 degrees F
    DRAM                      3584 MB (4096 MB installed)
    Memory utilization          22 percent
    CPU utilization:
      User                      43 percent
      Background                 0 percent
      Kernel                    28 percent
      Interrupt                  0 percent
      Idle                      29 percent
    Model                          RE-S-2000
    Serial ID                      9012022174
    Start time                     2016-03-22 16:47:56 GMT
    Uptime                         16 minutes, 42 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       4.71       1.11       0.46

Having failed over the RE now, all that needed is to repeat the same command as before request system software validate add /var/tmp/xxx reboot to install the new firmware on RE0

[email protected]> request system software add /var/tmp/jinstall-14.1R6.4-domestic-signed.tgz reboot

Once the Routing-Engine has upgraded you need to re-enable graceful-switchover and non-routing, first, you will see why in moment.

activate chassis redundancy graceful-switchover
activate routing-options nonstop-routing

After commit synchronising those changes, you will need to set RE0 back to the Master Routing Engine. This will be done by running request chassis routing-engine master switch from RE1 now.

[email protected]>request chassis routing-engine master switch 
Toggle mastership between routing engines ? [yes,no] (no) yes 

Resolving mastership...
Complete. The other routing engine becomes the master.

{backup}
[email protected]> 

Now that Graceful Switchover is enable when you run the command you wont see the same warning about traffic being disrupted, this is because Graceful Switchover preserves interface and kernel information allowing the PFE to continue forwarding packets, even though one of the REs is unavailable.

NOTE
More detail Graceful Switchover can be found on Juniper’s TechLibrary

We will need to save a backup of the currently running and active file system by issuing the command request system snapshot on both primary as well as backup REs

[email protected]> request system snapshot 
Doing the initial labeling...
Verifying compatibility of destination media partitions...
Running newfs (899MB) on hard-disk media  / partition (ad2s1a)...
Running newfs (100MB) on hard-disk media  /config partition (ad2s1e)...
Copying '/dev/ad0s1a' to '/dev/ad2s1a' .. (this may take a few minutes)
Copying '/dev/ad0s1e' to '/dev/ad2s1e' .. (this may take a few minutes)
The following filesystems were archived: / /config

Finally, confirm the code version by running show version invoke-on all-routing-engine. I’ve used the commands show version and show version invoke-on other-routing-engine just because I can put the two outputs into a table-like thing and it looks neater in a post :p

show version RE0show version RE1
{master}
[email protected]> show version          
Hostname: RE0-MX480-02
Model: mx480
Junos: 14.1R6.4
JUNOS Base OS boot [14.1R6.4]
JUNOS Base OS Software Suite [14.1R6.4]
JUNOS Packet Forwarding Engine Support (M/T/EX Common) [14.1R6.4]
JUNOS Packet Forwarding Engine Support (MX Common) [14.1R6.4]
JUNOS platform Software Suite [14.1R6.4]
JUNOS Runtime Software Suite [14.1R6.4]
JUNOS Online Documentation [14.1R6.4]
JUNOS Services AACL Container package [14.1R6.4]
JUNOS AppId Services [14.1R6.4]
JUNOS Services Application Level Gateways [14.1R6.4]
JUNOS Services Captive Portal and Content Delivery Container package [14.1R6.4]
JUNOS Border Gateway Function package [14.1R6.4]
JUNOS Services HTTP Content Management package [14.1R6.4]
JUNOS IDP Services [14.1R6.4]
JUNOS Services LL-PDF Container package [14.1R6.4]
JUNOS Services Jflow Container package [14.1R6.4]
JUNOS Services MobileNext Software package [14.1R6.4]
JUNOS Services Mobile Subscriber Service Container package [14.1R6.4]
JUNOS Services PTSP Container package [14.1R6.4]
JUNOS Services NAT [14.1R6.4]
JUNOS Services RPM [14.1R6.4]           
JUNOS Services Stateful Firewall [14.1R6.4]
JUNOS Voice Services Container package [14.1R6.4]
JUNOS Services Crypto [14.1R6.4]
JUNOS Services SSL [14.1R6.4]
JUNOS Services IPSec [14.1R6.4]
JUNOS py-base-i386 [14.1R6.4]
JUNOS Kernel Software Suite [14.1R6.4]
JUNOS Crypto Software Suite [14.1R6.4]
JUNOS Routing Software Suite [14.1R6.4]
{master}
[email protected]> show version invoke-on other-routing-engine 
re1:
--------------------------------------------------------------------------
Hostname: RE1-MX480-02
Model: mx480
Junos: 14.1R6.4
JUNOS Base OS boot [14.1R6.4]
JUNOS Base OS Software Suite [14.1R6.4]
JUNOS Packet Forwarding Engine Support (M/T/EX Common) [14.1R6.4]
JUNOS Packet Forwarding Engine Support (MX Common) [14.1R6.4]
JUNOS platform Software Suite [14.1R6.4]
JUNOS Runtime Software Suite [14.1R6.4]
JUNOS Online Documentation [14.1R6.4]
JUNOS Services AACL Container package [14.1R6.4]
JUNOS Services Application Level Gateways [14.1R6.4]
JUNOS AppId Services [14.1R6.4]
JUNOS Services Captive Portal and Content Delivery Container package [14.1R6.4]
JUNOS Border Gateway Function package [14.1R6.4]
JUNOS Services HTTP Content Management package [14.1R6.4]
JUNOS Services Jflow Container package [14.1R6.4]
JUNOS IDP Services [14.1R6.4]
JUNOS Services LL-PDF Container package [14.1R6.4]
JUNOS Services MobileNext Software package [14.1R6.4]
JUNOS Services Mobile Subscriber Service Container package [14.1R6.4]
JUNOS Services NAT [14.1R6.4]           
JUNOS Services RPM [14.1R6.4]
JUNOS Services PTSP Container package [14.1R6.4]
JUNOS Services Stateful Firewall [14.1R6.4]
JUNOS Voice Services Container package [14.1R6.4]
JUNOS Services SSL [14.1R6.4]
JUNOS Services Crypto [14.1R6.4]
JUNOS Services IPSec [14.1R6.4]
JUNOS py-base-i386 [14.1R6.4]
JUNOS Kernel Software Suite [14.1R6.4]
JUNOS Crypto Software Suite [14.1R6.4]
JUNOS Routing Software Suite [14.1R6.4]

And with that, we have an upgraded Dual Routing Engine MX Series router! 😀 Yay! I’ll most likely, now that I’ve got the access, mess about with ISSU upgrade on MX next so keep an eye for that one!

Reference

Configuring Dual Routing Engines MX Series
Procedure to Upgrade JUNOS on a Dual Routing Engine System
Understanding Graceful Switchover (GRES)

Share this:
Share

Open Shortest Path First Notes

Reading Time: 15 minutes

As part of studies, this post will be my notes on the Routing Protocol Open Shortest Path First

OSPF Basics

  • What is OSPF?
  • OSPF Structure

Neighbour Discovery

  • Inter-Node Communication
  • OSPF Packet Details
  • OSPF Hello Messages Details
  • Router-ID Selection Process
  • OSPF Neighbour Adjacency Process
  • Designated Router & Backup Designated Router
  • Designated Router Election

Network Types

  • Broadcast
  • Non-Broadcast Multi-Access
  • Point-to-Point
  • Point-to-Multipoint
  • Loopback

Scaling OSPF

  • Areas
  • Router Types
  • OSPF Route Types
  • Link-State Advertisement Types
  • Area Types

OSPF Basics

What is OSPF

Open Shortest Path First (OSPF) is an Open-Standard Interior Gateway Protocol (IGP) routing protocol. Unlike other Routing Protocols such as Routing Information Protocol (RIP), Enhanced Interior Gateway Routing Protocol (EIGRP) or Border Gateway Protocol (BGP), OSPF uses the Link State Algorithm in conjunction with Edsger W. Dijkstra Shortest Path First (SPF) algorithm to send out OSPF advertisements, known as Link-State Advertisements (LSAs), to share its Local Link-State Database (LSDB) with OSPF enabled devices to create an overall topology of every router, link state and link metric within a network. OSPF is defined in RFC2328:

OSPF is a link-state routing protocol. It is designed to be run internal to a single Autonomous System. Each OSPF router maintains an identical database describing the Autonomous System’s topology. From this database, a routing table is calculated by constructing a shortest-path tree.

OSPF recalculates routes quickly in the face of topological changes, utilizing a minimum of routing protocol traffic. OSPF provides support for equal-cost multipath. An area routing capability is provided, enabling an additional level of routing protection and a reduction in routing protocol traffic. In addition, all OSPF routing protocol exchanges are authenticated.

OSPF advertises and receives LSAs to/from neighbouring routers; these LSAs are stored with the router’s local LSDB. Whenever there is a change in the network new LSA’s will be flooded across the routing domain and all the routers will have to update their LSDB. This is due to the nature of the Link State and SPF Algorithms; essentially all OSPF routers have to same synchronized identical copy of the Link State Database to have a complete loop-free map of the network topology.

OSPF Structure

OSPF can be described as a two-tier hierarchical structure. This is because you have two main area types: Backbone Area and Non-Backbone Areas. The Backbone Area is known as Area 0 and Non-Backbone Areas are all other Areas. All Non-Backbone Areas MUST connect to Area 0. It is important to note, that OSPF routers in different Areas DO NOT have the same synchronized identical copy of each Link State Database however routers within the same Area will have an identical Link State Database. This is because; Area 0 provides transit for All Non-Backbone Areas. Non-Backbone Areas advertise their routes into Area 0 and Area 0 will advertise all routes learnt to the other Areas, as shown here

Neighbour Discovery

Inter-Node Communication

Communication between OSPF routers is done, dependent on network type, over IP using it own protocol number 89 sending multicast OSPF packets between each other. There are two multicast addresses that have been defined for OSPF enabled routers/interfaces to dynamically find neighbours. RFC2328 defines them as:

AllSPFRouters: This multicast address has been assigned the value 224.0.0.5. All routers running OSPF should be prepared to receive packets sent to this address. Hello packets are always sent to this destination. Also, certain OSPF protocol packets are sent to this address during the flooding procedure.

AllDRouters: This multicast address has been assigned the value 224.0.0.6. Both the Designated Router and Backup Designated Router must be prepared to receive packets destined to this address. Certain OSPF protocol packets are sent to this address during the flooding procedure.

OSPF Packet Details

As stated above, OSPF has it own dedicated IP protocol as reserved by Internet Assigned Number Authority (IANA) within the protocol, OSPF exchanges 5 types of packets:

Type Packet Name Packet Function
1 Hello Discovers and Maintains Neighbours
Hello are sent to ensure that neighbours are still available and online
2 Database Description
(DBD/DDP)
Summarize Database contents
When an adjacency is being formed, this packet will describe the content of the Link-State Database being received
3 Link-State Request (LSR) Database Download
These are used to request more detail about a portion of LSDB from one router to another, when some details are regarded as stale
4 Link-State Update (LSU) Database Update
This packet is normally in response to LSR packet, it provides an update to the LSDB as requested by a neighbour
5 Link-state Ack Flooding Acknowledgment
When the router receives a LSA flood, it will response to the flood to ensure OSPF reliable

OSPF Hello Messages Details

As stated earlier, an OSPF Packet will be exchanged between routers to allow them to have the same synchronizes OSPF database. For Adjacency discovery and maintenance; an OSPF Hello Message is flooded to all enabled interfaces, two routers that have the same matching hello messages will create an OSPF adjacency. The table below shows all the parameters that are within a Hello Message, with the first eight parameters needing to match for an adjacency to form:

Parameter Function
Hello Interval Amount of time between hello packets being sent and recieved
Dead Interval How long to wait between hello packets before marking the neighbour as dead, by default the dead interval is 4x the hello interval. Essentially, the router can miss for hello interval before updating that the neighbour is down
Area ID Both neighbour in the same OSPF Area.
Subnet Mask This is for connectivity both neighbours will need to be in the same subnet
Stub Area Flag This is for when the neighbour has been defined as Stub Area. Within OSPF all Areas that have been defined as Stub Areas mark their hello messages with the Stub Flag
Authentication Securing communication between neighbours. This can be configured with None, Clear Text or MD5
OSPF Router ID An unique 32-bit ID number that’s set in dotted-decimal format
Maximum Transmission Unit (MTU) As OSPF doesn’t support packet fragmentation, the MTU must be the same on both side.
From my experiences this is only changed if you are using Jumbo Packet sizing
Router Priority Used to determine Designated and Backup Designated Routers
Designated Router &
Backup Designated Router
The IP addresses of the Designated and Backup Designated Routers
Active Neighbours List of all the neighbours (the router) has recieved a Hello Message from, within the dead interval

OSPF uses its ALLSPFRouters address to send out hello messages across all OSPF enabled interfaces. It is important to add that if you have an interface that has been set as a passive OSPF interface, this interface will still be advertised into an OSPF routing domain however hello messages ARE NOT sent out. From my experiences this is commonly used on loopback address or external/customer facing interfaces. As you would want to advertise the subnet into OSPF however you wouldn’t want to have start an OSPF Neighbour Relationship between your ISPs or Customers.

The OSPF Router-ID is an important attribute when it comes to identifying a router within the OSPF domain. Each OSPF router has a Router-ID that is associated with the OSPF process, so it is possible to have to have two different processes active on single router with two different Router-IDs. The OSPF Router-ID has to be configured in 32-bit dotted decimal format, this is case whether you are using OSPFv2 (IPv4) or OSPFv3 (IPv4 and IPv6). As discussed in RFC2328

As each router will be getting an ID number, it is important to note, that these IDs have to be unique and no neighbour in the same OSPF domain can have same Router-ID. If two routers were to have same Router-ID, they wouldn’t be able to create a neighbour relationship. Additionally other neighbours peered with the both will have an issues with OSPF updates that come from the same Router-ID however the link-state databases are different, this can cause OSFP Flood War

OSPF Router-ID Selection Process

The process of selecting the Router-ID within OSPF follows this order:

  1. Hard Coding the Router-ID: If the Router-ID manually configured under the OSPF process this take precedence over everything. This is recommended and best practice
  2. Highest Logical IP Address: This will be the highest loopback address configured on the router
  3. Highest Active Physical IP address: This will be the highest IP address configured on a physical interface on the router

If you don’t hard code the router-id you will need to always remember, when you are making IP address updates on the router if you configure a new loopback or interface IP address that is higher than the currently OSPF Router-ID, it will change the Router-ID and can cause OSPF re-convergence, if the process is cleared or the device is reloaded.

OSPF Neighbour Adjacency Process

With OSPF, unlike, other IGPs has 2 Neighbour Adjacency states:
OSPF Neighbours: OSPF Neighbours are when two routers/devices have stop at the 2-Way neighbour state. At this state the neighbours bidirectional connectivity and all the OSPF parameters match. But it is important to note that the neighbours DO NOT exchange their link-state databases at this state.

OSPF Fully Adjacent Neighbours: OSPF Fully Adjacent Neighbours is when the two routers have the same bidirectional connectivity and all OSPF parameters match, however with Fully Adjacent Neighbours, each router will exchange their full link-state database with its neighbours and advertise the relationship in a link-state update packets.

Within OSPF there are 8 neighbour states that two neighbours can go through to become Fully Adjacent Neighbours. These states are:

State Description
Down This is the start state of neighbour communications. No Hello Messages have been exchanged
Attempt This state is valid only for Non-Broadcast Multi-Access (NBMA) networks. It is when a hello packet has not been received from the neighbour and the local router is going to send a unicast hello packet to that neighbour within the specified hello interval period.
Init The router has received a Hello Message from a neighbour, but has not received its own Router-ID from the neighbour. This means that Bidirectional communications have not been established yet.
2-Way Bidirectional communication between the neighbours have been established, no Link State information has been exchanged. At this state an OSPF Neighbourship has been created
ExStart This is where the neighbours start the process of becoming Fully Adjacent OSPF Neighbours and exchange Link State Databases
Exchange At this state, Link State Database details has been sent to the adjacent neighbour. At this state, a router is capable to exchange all OSPF routing protocol packets.
Loading At this state, the neighbour has exchanged its own LSDB, however has not fully requested/received LSA’s from its neighbour
Full Both LSDB’s have been exchanged and are fully synchronized. Each neighbour will have the full OSPF Network Topology available now

Designated Router & Backup Designated Router

OSPF has the concept of Designated and Backup Designated Routers (DR and BDR) for Multi-Access Networks that use technologies such as Ethernet and Frame Relay, as on the LAN you can have more than two OSPF enabled router. By having DR and BDRs, it assists in scalable of an OSPF segment, in addition to reducing OSPF LSA flooring across the network. This is because the other routers (OSPF DROthers) on the LAN, only create a Full OSPF Adjacency with the DR and BDR rather than with other DRothers. The DR is the solely responsible for flooding the LAN with LSA updates during a topology change. The flooring by the DR is controlled, as stated above, by the AllSPFRouters and AllDRouters multicast addresses. DR will flood LSAs to the AllSPFRouters destination address to communicate with other routers on the LAN; and DROthers will communicate their LSAs to DR and BDR using the AllDRouters destination address.

As the name suggests the BDR role is to be the secondary router in case the DR was the fail or be un-contactable, it will take over as the DR and another BDR will be elected. The BDR has a full OSPF Adjacency just like the DROthers with the BR, however unlike them, the BDR can listen on the ALLDRouters address. This means, in a situation of a DR failure, the BDR can take over as DR quicker and there will be less re-convergence across the network, as it already synchronized to the DR and the DROthers as they will all have the same LSDB.

Designated Router Election Method

The DR/BDR Election process is done during the 2-Way State, where bidirectional communications has been established between the routers and have received Hello Messages. OSPF uses Interface Priority and Router-ID to determine, which routers will be elected as DR and BDR. An OSPF router can have its interface priority set between 0-255, (an interface priority set to 0 means it is prohibited from entering DR/BDR election process) with the highest priority taking the role as the DR and the secondary highest priority becoming the BDR. If the priorities are all the same, the highest Router-ID will be used as the tiebreaker.

By default, OSPF’s priority is 1 on Cisco IOS/XR and 128 on Juniper. With Cisco IOS XR, you are able to set the priority for all interface within an area globally and under the interface, whereas Junos and Cisco IOS you can only set priority under the interface.

IMPORTANT NOTE
If an OSPF router receives a Hello Packet with the Router-ID for the DR or BDR isn’t 0.0.0.0, it will assume that DR and BDR have been elected already and will become a DROther.

Network Types

Depending on what the Layer-2 topology looks like within a network can have affect on the behaviour of OSPF. A Topology that uses Ethernet commonly allow multiple node on a LAN, in this case a Designated Router (DR) and Backup Designated Router (BDR) are used to cut down the OSPF LSA flooding, due to both supporting broadcast domains. Whereas other media such as serial links or Frame Relay don’t support broadcast domains meaning DR/BDR are not needed.

With this in mind OSPF has 5 different network types:

Broadcast

A Broadcast network is where an OSPF router is able to send a single message (broadcast message) that is able to communicate to more than 2 other OSPF routers on the same multi-access segment. i.e. Router A, B and C are connected to a Switch when Router A sends out a Hello Message it will be broadcasted across the segment via the Switch. With in this in mind, the need for DR/BDR will be required to control the LSA flooding across the segment. By default OSPF uses broadcast as the network type when configured on Ethernet LAN. The hello timers by 10/40 by default.

Non-Broadcast Multi-Access (NBMA)

This network type is used on links that do not support broadcast domain, media such as Frame Relay, ATM and X.25, or topologies like a hub and spoke where a router can connect to multiple nodes out of a single interface however isn’t fully meshed. A Non-Broadcast network will need to have DR/BDR configured, as you could have multiple nodes on the segment. However, Non-Broadcast network (as the name would suggest) doesn’t support broadcast or multicast, this means that OSPF’s normal way of sending hellos via the multicast address 224.0.0.5 to flood LAN looking for neighbours will not work. Instead it sends out unicast hello messages to statically configured neighbours. The hello timers are 30/120 by default.

Point-to-Point

This network type is commonly used when you only have two devices on the segment, ie if you have Router A connected to Router B using /31 or /30 that will be regarded as Point-to-Point (P2P) network. This network type doesn’t require DR/DBR as the two devices only have each other to communication and forming a DR/BDR would be a waste of Router resources. In addition, it important to note that P2P OSPF Adjacency form quicker as DR election is ignored and there is no wait timer. The hello timers by 10/40 by default and it supports OSPF Multicast Hello Messages.

Point-to-Multipoint

This network is commonly used when in a partially mesh network or hub and spoken network, where the Layer-2 topology doesn’t logically match the Layer-3 topology. I.e. in a hub and spoke or frame-relay network, Router A will be connected to Routers B and C, all on the same subnet, the Layer-3 will assume Routers B and C will be able directly connected on the same LAN, whereas the Layer-2 determines that Router B can only communicate with Router C by going via Router A. By using Point-to-Multipoint, it will advertise all each neighbour as a /32 endpoint forcing the Layer-3 routing to matches the Layer-2 by using Longest prefix match. The hello timers are 30/120 by default, doesn’t require DR/DBR and it supports OSPF Multicast Hello Messages.

Loopback

This network type is by default enabled on all loopback interfaces and can only be configured on loopback addresses. OSPF will always advertise loopback addresses as /32 route, even if the interface has been configured with a different prefix length. Hello messages, Timers and DR/BDR are not associated with Loopback network types.

Scaling OSPF

Areas

The wider a network gets, the wider OSPF domain will become. This can be an issue as all of these routers will need to maintain the same LSDB, and with a larger network more resources will be used processing LSA flooding and running SPF algorithm, which in turn will make the router run inefficient and possible start dropping packets. A way of easing this issue is to introduce OSPF Areas. OSPF Areas are used reduce the amount of the routers in a single area, in turn shrinking the LSDB size, restricts LSA flooding within/between areas, allows route summarization between Areas and increases SPF calculations. This is because routers maintain their own LSDB on a per-area basis. Essentially, Areas hide the their own topology and any LSA flooding or SPF calculations will same local to that area whilst the rest of the network stays unaware. Routers within the same area will have the same synchronized LSDB with Routers with interfaces in multiples area will hold LSDBs.

Router Types

Along with Area Types, OSPF has 4 different types of roles that an OSPF router could be, and dependent on the topology, multiple types at once. The table below describes the different Router types and you can see where each of these router types could sit within a simple topology here

Router Type Function
Backbone Router A router that is located and/or has a link(s) within Area 0 is known as a Backbone router. If this router has links to non backbone routers, it can also be known as an Internal router.
Internal Router An internal router is an OSPF router that only have links within a single area. If this router is within Area 0, it will also be known as Backbone Router.
Area Border Router (ABR) An Area Border Router (ABR) is a router that has links between 2 areas. ABRs are role is to inject routes from non-backbone areas into Backbone. For a router to be an ABR, it HAS to have a link to Area 0, if it doesn’t then it wont be an ABR. It is considered a member of all areas it is connected to. An ABR keeps multiple copies of the link-state database in memory, one for each area to which that router is connected.
Autonomous System Boundary Router (ASBR) An OSPF router that learns routes from external routing protocols (BGP, IS-IS, EIGRP, OSPF), Static Routes and/or both and injects them into OSPF via redistribution. ASBRs are special types of routers, as you have can ASBR that isn’t ABR as these ASBR functions are independent to ABR functions, but dependant on the topology, you could have router that is both an ASBR and ABR.

OSPF Route Types

OSPF has a unique relationship between how routes are exchanged between areas and how these routes are ranked in importance. There’s 3 types of the Routes that are exchanged within OSPF Inter-Area, Intra-Area and External Routes, and in regards with the External Routes, you have 2 different types of External Routes:

Intra-Area Routes: these are routes that are learnt from Routers that are within the same area. They are also known as internal routes
Inter-Area Routes: these are routes that have been learnt from different areas. These routes have been injected via an ABR. They are also known as summary routes.
External Routes: are routes that are learnt outside of the OSPF domain. These routes have been learnt via redistribution by an ASBR. External routes have 2 classifications Type 1 and Type 2.

  1. Type 1 Routes: Type 1 routes, metric value equals the Redistribution Metric + Total Path Metric. This means that the metric values will increase the further the route goes into the network from the injecting ASBR. Type 1 routes are also known as E1 and N1 External Routes
  2. Type 2 Routes: Type 2 routes, metric value is only the Redistribution Metric. This means that the metric value will stay the same, no matter the how far the route goes into the network (within in 30 hops) from the injecting ASBR. By default, type 2 is the metric type used by OSPF. Type 2 routes are also known as E2 and N2 External Routes
NOTE
The order of preference for these route types are as followed:

  1. Intra-Area
  2. Inter-Area
  3. External Type 1
  4. External Type 2

Link-State Advertisement Types

Devices in an OSPF domain use LSAs to build their local areas LSDB. These LSDBs are identical for devices in the same area and different areas and different router types can produce different type of LSAs. There is 11 types of LSAs however typically there are 6 LSAs that are commonly used and that should be known. These are:

Type 1 – Router

Every OSPF Router will advertise Type 1 Router LSA, these LSAs are used to essentially build the LSDB. Type 1 LSAs are entries that describe the interfaces and neighbours of each and every OSPF router within the same area. In addition, these LSAs ARE NOT forward outside its own area, making the intra-area topology invisible to other areas.

Type 2 – Network

A Type 2 Network LSA, are used over Broadcast OSFP domain with a DR. Network LSAs are always advertised by the DR and is used to identify all the routers (BDR and DRothers) across the multi-access segment. As with Type 1 LSAs, Network LSAs ARE NOT advertised outside of its own area, making the intra-area topology invisible to other areas.

Type 3 – Summary

Summary LSAs are the prefixes that are learnt from Type 1 and 2 LSAs and advertised by an ABR into other areas. ABRs DO NOT forward Type 1 and 2 LSAs to other areas, any Network and/or Router LSAs are received by an ABR, it will be converted into Type 3 LSA with Type 1 and 2 information referenced within. If an ABR receives a Type 3 LSA from a Backbone router, it will regenerate a new Type 3 LSA and list itself as the advertising router and forward the new Summary LSA to non-backbone area. This is how inter-area traffic is process via ABR.

Type 5 – External

An External Type 5 LSA are flooded throughout an OSPF domain when route(s) from another routing protocol is Redistributed via an ASBR. These LSAs are not associated to any area and are flooded unchanged to all areas, with the expectation to Stub and Not-So-Stubby Areas.

Type 4 – Autonomous System Boundary Router (ASBR) Summary

When a Type 5 LSAs is flooded to all areas, the next-hop information may not be available to other areas because the route(s) would have been redistributed from another routing protocol. To solve this ABR will flood the Router ID of the originating ASBR in a Type 4 ASBR Summary LSA. The link-state ID is the router ID of the described ASBR for type 4 LSAs. Essentially, any routes that are redistributed into OSPF, when, the first ABR receives the Type 5 LSA, it will generate and flood a Type 4 LSA.

Type 7 – Not So Stubby Area (NSSA) External

Routers in a Not-so-stubby-area (NSSA) do not receive external LSAs from Area Border Routers, but are allowed to send external redistributed routes to other areas. As ABR DO NOT advertise Type 7 LSAs outside of their local. The ABR will covert the Type 7 LSA into a Type 5 LSA and flood the Type 5 LSA across the OSPF domain, as normal.

In addition to the LSA types above, the other 6 LSA types that are within OSPF are:

  • Type 6 – Multicast Extension LSA
  • Type 8 – OSPFv2 External Attributes LSA, OSPFv3 Link-Local Only LSA
  • Type 9 – OSPFv2 Opaque LSA, OSPFv3 Intra-Area Prefix LSA
  • Type 10 – Opaque LSA
  • Type 11 – Autonomous System Opaque LSA

Types 9 – 11 are defined in RFC5250 and RFC2370. They are typically used as MPLS Traffic Engineering OSPF Extension. I personally, haven’t looked into as of yet however will update once I have done more reading into them.

Area Types

OSPF defines several special area types:

Backbone

As described earlier, the Backbone Area also know as Area 0, this is the most important area in OSPF and there always has to be a Backbone Area. The Backbone Area MUST connect to all areas, as non-backbone area have to use Area 0 as transit area to communicate to other non-backbone areas. This is because the Backbone has all the routing information inject into it and advertises them out. This design is important to prevent routing loops.

Stub Area

A Stub Area DOES NOT allow External Routes to be advertised within the area. This means when an ABR to a Stub Area receives a Type 5 (External) and Type 4 (ASBR Summary) LSAs, the ABR will generate a default route for the area as Type 3 Summary LSA.

Not So Stubby Area (NSSA)

A Not So Stubby Area are similar to Stub Areas as they DO NOT allow Type 5 External however unlike Stub Areas, Not So Stubby Areas DO redistributed external routes via an ASBR into the area. As described above when route is redistributed into the NSSA, a Type 7 NSSA External LSA is flooded throughout the area and once an ABR receives the Type 7 LSA, it is converted into a Type 5 LSA and flooded into other areas. It is important to add, by default the NSSA does not advertise a default route automatically when Type 5 or Type 7 LSAs are blocked by an ABR.

Totally Stubby Area (TSA)

A Totally Stubby Area DOES NOT allow any Inter-Area or External Routes to advertised with the area. Essentially, if a Type 3 Summary or Type 5 External LSA, by the ABR, it will generate default route and inject it to the area. Totally Stubby Areas only allow Intra-Area and Default Routes within the area. The only way for traffic to get routed outside of the area is a default route, which is the only Type-3 LSA, advertised into the area.

Totally Not So Stubby Area (TNSSA)

Totally Not So Stubby Areas DOES NOT permit Type 3 Summary, Type 4 ASBR and Type 5 External LSAs being received into the area. However just like a NSSA, it allows redistributed external routes into the area via an ASBR. Just like NSSA when route is redistributed into the NSSA, a Type 7 NSSA External LSA is flooded throughout the area and once an ABR receives the Type 7 LSA, it is converted into a Type 5 LSA and flooded into other areas, but unlike a NSSA when TNSSA ABR receives a Type 3 LSA from the backbone, it will automatically generate a default route and inject into the area.

Share this:
Share

Configuring IPv4 DHCP Juniper SRX

Reading Time: 1 minute

After configuring a Dual Stacked DHCP server and DHCPv6 on Juniper SRX, it’s only right that I did something on Configuring DCHPv4 on a Juniper SRX.

This wont be a long or detailed post, as the configuration is very much the same as my previous post on how to configure DHCPv6 on a SRX, and I’ve went thought quite a lot before about how DHCP works etc.

First, under the system services dhcp-local-server stanza, you will need to create group and set a physical or logical interface that will have DHCP enabled

[email protected]# show system services dhcp-local-server    
group dhcpv4-group {
    interface vlan.3407;
}

Next, under the access address-assignment stanza, you will need to set the network, the DHCP range and set the IP address that the router will be using within the DCHP pool. The propagate-settings will take configuration from the client DHCP on vlan.3407, if not otherwise specified, most importantly name-server which changes from ISP to ISP and is very important otherwise name resolutions on the LAN won’t work.

[email protected]# show access   
address-assignment {
    pool v4 {
        family inet {
            network 172.31.106.16/29;
            range v4-range {
                low 172.31.106.18;
                high 172.31.106.22;
            }
            dhcp-attributes {
                router {
                    172.31.106.17;
                }
                propagate-settings vlan.3407;
            }
        }
    }
}

This will be all configuration needed to have DHCPv4 on Juniper SRX220. For troubleshooting DHCP you will be able to use the commands below:

[email protected]> show dhcp ? 
Possible completions:
  client               Show DHCP client information
  relay                Show DHCP relay information
  server               Show DHCP server information
  snooping             Show DHCP snooping information
  statistics           Show DHCP service statistics

As I said, this is the quick post :p

I have included the set commands used in my example below:

DHCP Set Commands
set system services dhcp-local-server group dhcpv4-group interface vlan.3407
set access address-assignment pool v4 family inet network 172.31.106.16/29
set access address-assignment pool v4 family inet range v4-range low 172.31.106.18
set access address-assignment pool v4 family inet range v4-range high 172.31.106.22
set access address-assignment pool v4 family inet dhcp-attributes router 172.31.106.17
set access address-assignment pool v4 family inet dhcp-attributes propagate-settings vlan.3407
Share this:
Share