What is BGP FlowSpec?

I recently messed about with some Junos Automate Scripts that one of my colleagues had previously been working on, which can be used to add static routes to enable Remote Triggered Blackhole (RTBH) filtering (which can be found here), and I found them a bit rough around the edges (for people who aren’t CLI junkies). As I do, I started looking into RTBH and saw that it’s a heavy-handed way of combating DDoS attacks against a network. RTBH has been around for a number of years now and is defined in RFC 3882 and RFC 5635. In its most basic terms, you blackhole all traffic from a source address and/or to a destination address by injecting the attacking/attacked prefix into BGP with a community that rewrites the next-hop to a pre-configured discard route on the edge routers. If you have a massive DDoS, trying to block every source address would be like going fishing with a shotgun, and by blocking the destination address you hand the attacker their desired outcome. With that in mind, RTBH is ideally a last resort. There is an alternative, more subtle way of blocking unwanted attack traffic from our network, and it is known as BGP FlowSpec.

What is BGP FlowSpec

BGP FlowSpec is defined in RFC 5575, which defines a new Multi-Protocol BGP (MP-BGP) extension with new Network Layer Reachability Information (NLRI). The new NLRI carries 12 types of Layer 3 and Layer 4 details that are used to define a Flow Specification, and actions are then assigned to these routes depending on the user’s needs. If you want to look at FlowSpec in its simplest form, it is a firewall filter that is injected into BGP to filter out specific port(s) and protocol(s), just as a normal ACL would do. BGP uses NLRI to exchange routing details between BGP speakers, and each MP-BGP extension has its own NLRI format, identified by its Address Family Identifier (AFI) and Subsequent Address Family Identifier (SAFI). IPv4 unicast is usually the default address family (also known as a BGP family) for BGP peers. If anything other than IPv4 unicast routes needs to be exchanged, e.g. IPv6, EVPN, L2VPN or FlowSpec routes, then MP-BGP carries the relevant NLRI for that family, along with its next-hop, in its own attributes. This is defined in RFC 2858 and RFC 4760. As stated above, as of writing, 12 NLRI component types have been defined for BGP FlowSpec; the components in use are added to the NLRI field within the BGP Update message and advertised to peers. In addition, FlowSpec does not support IPv6 yet.

FlowSpec NLRI Types

These are the 12 FlowSpec NLRI types:

Type | NLRI Component | Description
1 | Destination Prefix | Defines the destination prefix to match
2 | Source Prefix | Defines the source prefix to match
3 | IP Protocol | Contains a set of {operator, value} pairs that are used to match the IP protocol value byte in IP packets
4 | Port | Matches TCP or UDP packets where either the source or the destination port equals the given value(s)
5 | Destination Port | Defines the destination port that will be influenced by FlowSpec
6 | Source Port | Defines the source port that will be influenced by FlowSpec
7 | ICMP Type | Matches the type field of an ICMP packet
8 | ICMP Code | Matches the code field of an ICMP packet
9 | TCP Flags | Matches the flags in the TCP header
10 | Packet Length | Matches the total IP packet length (excluding Layer 2 but including the IP header)
11 | DSCP | Matches the 6-bit DSCP (class of service) field
12 | Fragment | Matches the fragmentation state of the packet (don't fragment, is a fragment, first fragment, last fragment)

NOTE: Not all 12 types have to be defined for FlowSpec to be enabled
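
To make the NLRI a bit more concrete, here is a minimal Python sketch of how a handful of components could be packed into a FlowSpec NLRI, going by my reading of the RFC 5575 encoding. The function names are my own and the addresses are just example values. It only handles the simple case of a single "equals" match with a 1-byte value, so treat it as an illustration rather than a full encoder.

```python
import ipaddress

# Numeric operator bits as laid out in RFC 5575 (bit 0 is the most significant bit).
END_OF_LIST = 0x80  # this is the last {operator, value} pair in the component
EQUALS = 0x01       # match when the packet field equals the value


def encode_prefix(component_type, prefix):
    """Types 1 and 2: type byte, prefix length in bits, then only the needed prefix bytes."""
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([component_type, net.prefixlen]) + net.network_address.packed[:nbytes]


def encode_single_value(component_type, value):
    """One {operator, value} pair meaning 'equals value' (simplified to 1-byte values)."""
    if not 0 <= value <= 255:
        raise ValueError("this sketch only handles 1-byte values")
    return bytes([component_type, END_OF_LIST | EQUALS, value])


def encode_flowspec_nlri(components):
    """Prefix the concatenated components with the 1-byte NLRI length."""
    body = b"".join(components)
    if len(body) >= 240:
        raise ValueError("this sketch only handles the 1-byte length form")
    return bytes([len(body)]) + body


# Only the components we care about are encoded, in ascending type order.
nlri = encode_flowspec_nlri([
    encode_prefix(1, "8.9.0.1/32"),       # destination prefix
    encode_prefix(2, "172.90.87.15/32"),  # source prefix
    encode_single_value(3, 6),            # IP protocol = TCP
    encode_single_value(4, 80),           # port = 80
])
print(nlri.hex())
```

Only four of the 12 types end up on the wire here, which is the point of the note above: anything you don't match on is simply left out.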

FlowSpec Actions

RFC 5575 defines a minimum of 4 actions that can be applied to traffic matching a FlowSpec route's NLRI components. These actions are carried as BGP extended communities attached to the FlowSpec route. The actions are:

Traffic-Rate Community

The Traffic-Rate community is a non-transitive community that tells the receiving BGP peer what rate to limit matching traffic to. If the traffic needs to be discarded or dropped, a rate limit of 0 should be used.
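
As a rough illustration of what that looks like on the wire, here is a small Python sketch that packs a Traffic-Rate extended community the way I understand RFC 5575 to describe it: type 0x8006, a 2-byte AS number, then the rate as a 4-byte IEEE float in bytes per second. The function name is my own.

```python
import struct


def traffic_rate_community(asn, bytes_per_second):
    """Pack an 8-byte traffic-rate extended community (type 0x80, sub-type 0x06)."""
    # "!" = network byte order, B = 1 byte, H = 2 bytes, f = 4-byte IEEE float
    return struct.pack("!BBHf", 0x80, 0x06, asn, bytes_per_second)


# A rate of 0 is how "discard" is signalled.
discard = traffic_rate_community(0, 0.0)
print(discard.hex())
```

A discard action is therefore nothing special as far as BGP is concerned, it is just a rate limit with the float set to zero.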

Traffic-Action Community

The Traffic-Action community is used to sample the defined traffic. This allows sampling and logging metrics to be collected for the FlowSpec route, which can be used to get a better understanding of the attack traffic.

Redirect Community

The Redirect community allows the FlowSpec traffic to be redirected into a Virtual Routing and Forwarding (VRF) instance. As the same Route-Targets and Route-Distinguishers can be used, you are able to import routes into a dedicated blackhole VPN or any other VPNv4.

Traffic-Marking Community

The Traffic-Marking community is used to modify the Differentiated Services Code Point (DSCP) bits of a transiting IP packet to the defined value. This could be used to set traffic matching FlowSpec routes to the highest discard probability, so that the traffic is not dropped/discarded until congestion occurs.

FlowSpec Rule Ordering

It is important to note that, unlike normal firewall filters, FlowSpec routes use a different method of ordering rules. Most firewall filters and/or ACLs use a top-down approach, wherein once the filter has a match, any rules afterwards are not inspected. With FlowSpec, a deterministic algorithm is used to order the rules. Starting from the leftmost component of each FlowSpec NLRI and comparing component by component, the algorithm uses the following details to order FlowSpec routes:

    1. If the component types differ, the rule with the lowest type comes first. If the types are the same, then the values within that component are compared
    2. For IP prefix values, the lowest IP prefix comes first. If the IP addresses are the same, then the most specific prefix comes first
    3. For all other types, the binary string of the contents is compared to determine the order
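
The three steps above can be sketched in Python roughly as follows. This is my own simplified take on the comparison described in RFC 5575, not how any router implements it: each rule is a list of (type, value) tuples sorted by type, prefixes are ipaddress networks, and every other component value is treated as a raw byte string (an assumption that skips the real operator/value encoding).

```python
import ipaddress
from functools import cmp_to_key

PREFIX_TYPES = {1, 2}  # destination prefix and source prefix


def cmp_prefix(a, b):
    """Lowest address wins on the common bits; on a tie the longer prefix wins."""
    common = min(a.prefixlen, b.prefixlen)
    shift = a.max_prefixlen - common
    a_bits = int(a.network_address) >> shift
    b_bits = int(b.network_address) >> shift
    if a_bits != b_bits:
        return -1 if a_bits < b_bits else 1
    return (b.prefixlen > a.prefixlen) - (a.prefixlen > b.prefixlen)


def cmp_bytes(a, b):
    """memcmp-style compare on the common length; on a tie the longer string wins."""
    common = min(len(a), len(b))
    if a[:common] != b[:common]:
        return -1 if a[:common] < b[:common] else 1
    return (len(b) > len(a)) - (len(a) > len(b))


def flow_rule_cmp(rule_a, rule_b):
    """Return -1 if rule_a takes precedence, 1 if rule_b does, 0 if they are equal."""
    for i in range(max(len(rule_a), len(rule_b))):
        # A rule that has run out of components counts as "infinitely high" type.
        type_a = rule_a[i][0] if i < len(rule_a) else float("inf")
        type_b = rule_b[i][0] if i < len(rule_b) else float("inf")
        if type_a != type_b:
            return -1 if type_a < type_b else 1
        compare = cmp_prefix if type_a in PREFIX_TYPES else cmp_bytes
        result = compare(rule_a[i][1], rule_b[i][1])
        if result:
            return result
    return 0


specific = [(1, ipaddress.ip_network("8.9.0.1/32")),
            (2, ipaddress.ip_network("172.90.87.15/32")),
            (3, b"\x06"), (4, b"\x50")]
broad = [(1, ipaddress.ip_network("8.9.0.0/24")), (3, b"\x06")]
source_only = [(2, ipaddress.ip_network("172.90.87.15/32"))]

ordered = sorted([source_only, broad, specific], key=cmp_to_key(flow_rule_cmp))
```

With these made-up rules, the /32 destination rule sorts ahead of the /24 one (same leading bits, more specific prefix), and both sort ahead of the rule that only has a source prefix (type 1 beats type 2).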

Validation Checks

Validation checks within FlowSpec are important because, without them, FlowSpec route(s) that could blackhole traffic could be injected by an attacker who doesn’t own the prefix(es) concerned. Like any other unicast BGP route, the next-hop address must resolve for the route to be usable, as per the normal BGP path selection process. In addition to a valid next-hop, RFC 5575 defines that the following must hold for a Flow Specification to be valid:

    1. The originator of the flow specification matches the originator of the best-match unicast route for the destination prefix embedded in the flow specification.
    2. There are no more specific unicast routes, when compared with the flow destination prefix, that have been received from a different neighbouring AS than the best-match unicast route, which has been determined in step 1

The overall goal is to confirm that the originator of the FlowSpec route is the same as the originator of the BGP unicast route. The originator is identified by BGP’s ORIGINATOR_ID attribute (as used by route reflection) or, if that isn’t present, by the peering IP address.
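
To show how the two checks fit together, here is a hedged Python sketch. The UnicastRoute class, the field names and the sample routes are all made up for illustration, and I'm assuming one already-selected best route per prefix, which a real BGP implementation would have to work out first.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class UnicastRoute:
    prefix: ipaddress.IPv4Network
    originator: str   # ORIGINATOR_ID, or the peering address if that is absent
    neighbor_as: int  # the neighbouring AS the route was received from


def best_match(routes, destination):
    """Longest-prefix match among the routes that cover the flow destination."""
    covering = [r for r in routes if destination.subnet_of(r.prefix)]
    return max(covering, key=lambda r: r.prefix.prefixlen, default=None)


def is_feasible(flow_destination, flow_originator, routes):
    dest = ipaddress.ip_network(flow_destination)
    best = best_match(routes, dest)
    # Check 1: the flow's originator must match the best-match unicast originator.
    if best is None or best.originator != flow_originator:
        return False
    # Check 2: no more specific unicast route from a different neighbouring AS.
    for route in routes:
        more_specific = route.prefix != dest and route.prefix.subnet_of(dest)
        if more_specific and route.neighbor_as != best.neighbor_as:
            return False
    return True


routes = [
    UnicastRoute(ipaddress.ip_network("8.9.0.0/24"), "192.0.2.1", 65001),
    UnicastRoute(ipaddress.ip_network("8.9.0.128/25"), "192.0.2.9", 65002),
]

print(is_feasible("8.9.0.1/32", "192.0.2.1", routes))  # originator matches the /24
print(is_feasible("8.9.0.1/32", "192.0.2.9", routes))  # wrong originator
print(is_feasible("8.9.0.0/24", "192.0.2.1", routes))  # /25 comes from another AS
```

The last case is the interesting one: the flow route covers a /25 that somebody else is announcing, so accepting it would let one AS filter traffic for a prefix it doesn't fully own.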

FlowSpec and Junos

Configuring FlowSpec on a Junos device is actually quite straightforward. I’m being naughty and I don’t actually have a topology set up to show the full verification ‘show command’ outputs on the CLI, but when I get the time to set something up, I’ll be back to edit this post. With all that said, let’s get cracking :p

The scenario is that we have an attack from 172.90.87.15 on TCP port 80 against the web server 8.9.0.1. First we will inject a FlowSpec route to discard all TCP port 80 traffic to 8.9.0.1 when the source is 172.90.87.15. We will need to make sure that the terms are ordered as per the RFC requirement; this is done under the routing-options flow stanza:

[email protected]# show routing-options flow                       
term-order standard;

Then enable the MP-BGP flow family on the BGP group:

[email protected]# show protocols bgp group test 
type internal;
family inet {
    unicast;
    flow;
}

Next, configure the FlowSpec route under the routing-options flow route stanza:

[edit routing-options flow route test]
[email protected]# show 
match {
    destination 8.9.0.1/32;
    source 172.90.87.15/32;
    protocol tcp;
    port 80;
}
then discard;

These are the options available under the match and then flags. You will note that they are largely the same as those stated in the RFC:

Match flags, followed by then flags:
[edit routing-options flow]
[email protected]# set route test match ?  
Possible completions:
+ apply-groups         Groups from which to inherit configuration data
+ apply-groups-except  Don't inherit configuration data from these groups
  destination          Destination prefix for this traffic flow
+ destination-port     Destination TCP/UDP port
+ dscp                 Differentiated Services (DiffServ) code point (DSCP) (0-63)
+ fragment             
+ icmp-code            ICMP message code
+ icmp-type            ICMP message type
+ packet-length        Packet length (0-65535)
+ port                 Source or destination TCP/UDP port
+ protocol             IP protocol value
  source               Source prefix for this traffic flow
+ source-port          Source TCP/UDP port
+ tcp-flags            TCP flags
[edit routing-options flow]
[email protected]# set route test then ?                          
Possible completions:
  accept               Allow traffic through
+ apply-groups         Groups from which to inherit configuration data
+ apply-groups-except  Don't inherit configuration data from these groups
  community            Name of BGP community
  discard              Discard all traffic for this flow
  next-term            Continue the filter evaluation after matching this flow
  rate-limit           Rate in bits/sec to limit the flow traffic (9600..1000000000000)
  routing-instance     Redirect to instance identified via Route Target community
  sample               Sample traffic that matches this flow

Once committed, you will be able to verify FlowSpec routes because they are installed into their own routing table, inetflow.0. If you have a dedicated VRF for FlowSpec routes, the table will be routing-instance-name.inetflow.0. You can also check the FlowSpec firewall filter by running the command show firewall filter __flowspec_default_inet__

FlowSpec table, followed by the FlowSpec firewall filter:
[email protected]> show route table inetflow.0 extensive 

inetflow.0: 6 destinations, 6 routes (6 active, 0 holddown, 0 hidden)
8.9.0.1,172.90.87.15,proto=6,port=80/term:3 (1 entry, 1 announced)
TSI:
KRT in dfwd;
Action(s): discard,count
        *Flow   Preference: 5
                Next hop type: Fictitious
                Address: 0x94359c4
                Next-hop reference count: 6
                State: 
                Local AS: 65123 
                Age: 4:10 
                Validation State: unverified 
                Task: RT Flow
                Announcement bits (1): 0-Flow 
                AS path: I
                Communities: traffic-rate:0:0
[email protected]> show firewall filter __flowspec_default_inet__    

Filter: __flowspec_default_inet__                              
Counters:
Name                                                Bytes              Packets
8.9.0.1,172.90.87.15,proto=6,port=80                    0                    0

Layer-2 VPNs on Junos

It has been a busy few weeks trying to stay ahead of all the new work that has been coming towards myself and the team, due to the insourcing of the core network! Luckily enough for my team, we have finally got our hands on full end-to-end connectivity! Fun times 😀

With that being said, I’ve been given a wee project to provision a circuit for a business customer between two sites for a Proof Of Concept. As this circuit is being used as a POC (for now), it was agreed that a Layer 2 VPN (L2VPN/pseudowire) would be best suited, because a simple point-to-point connection was needed between two PEs. As we have an MPLS-enabled network, it was decided that this would be the easiest way to get their POC up and running quickly, as we were under a bit of a hard deadline!

For me, it was a good little project: even though I knew what L2VPNs were and how they work, I had never configured one myself. You see where I’m going with this now?

This post will be a note on how to configure L2VPNs with Junos 😀

L2VPN, also known as a pseudowire, is defined in RFC 4665, where it is called Virtual Private Wire Service (VPWS):

The PE devices provide a logical interconnect such that a pair of CE devices appears to be connected by a single logical Layer 2 circuit. PE devices act as Layer 2 circuit switches. Layer 2 circuits are then mapped onto tunnels in the SP network. These tunnels can either be specific to a particular VPWS, or be shared among several services. VPWS applies for all services, including Ethernet, ATM, Frame Relay, etc. Each PE device is responsible for allocating customer Layer 2 frames to the appropriate VPWS and for proper forwarding to the intended destinations.

In essence, L2VPNs are virtual point-to-point circuits that use the underlying transport labels (LDP/RSVP) or a statically defined MPLS path to go between two PEs, which allows the extension of a layer 2 broadcast domain. If you need multiple sites on the same layer 2 broadcast domain, you will need to consider Virtual Private LAN Service (VPLS) or Ethernet VPN (EVPN).

Within Junos there are 3 ways of configuring L2VPNs: two are regarded as the modern way and have been ratified in RFCs, with an additional legacy method. Kompella and Martini are regarded as the industry standard, with Circuit Cross-Connect (CCC) seen as legacy:

  • Circuit Cross-Connect: The Circuit Cross-Connect style of L2VPN uses a single Outer Label, also known as the Tunnel/Transport Label, to transport the L2 payload from PE to PE. CCC can ONLY use RSVP as the MPLS transport. In addition, each CCC connection has its own dedicated RSVP-signalled LSP associated with it; the transport label cannot be shared between multiple connections. LSPs are manually created on each PE to determine which circuit the frame belongs to on the other end.
  • Martini: The Martini style of L2VPN has a pair of labels before the L2 frame. The Outer label is the transport mechanism that gets the frame from the egress interface of the sending PE to the ingress interface of the receiving PE. The Inner label, known as the VC Label, is the label that informs the receiving PE where the L2VPN payload should go. It is important to note that if you are using the Martini style, although either LDP or RSVP can be used as the MPLS transport, LDP is used for the signalling of the VC label. So if RSVP is used as the MPLS transport, LDP will need to be enabled on the loopback address of both PE routers. A minimum of 2 LSPs will need to be set, as MPLS LSPs are unidirectional.
  • Kompella: The Kompella style of L2VPN is similar to the Martini style, as both use stacked labels before the Layer 2 payload and both can use LDP, RSVP or both for the Transport Label. The difference is that, unlike Martini, Kompella uses BGP to signal the VC Label. This means you will need to have a BGP-enabled network. In addition, it’s not compulsory to set static LSPs, as BGP provides a mechanism for autodiscovery of new point-to-point links, similar to VPLS. Although Kompella has a more complex configuration, because of its usage of BGP signalling it is regarded as the best option for large scale deployments, as it works in conjunction with other BGP families. RFC 6624 has more details on L2VPN using BGP for Auto-Discovery and Signaling

In our network, we use the Kompella style of L2VPNs, so the bulk of my testing, and the most in-depth, was with that method… Although I was able to get a wee bit of naughty time afterwards to configure the other methods 🙂

The topology I’ll be working with is a simple one. I’ve got a single MX480 broken up into 3 Logical Systems.

L2VPN Topology


The underlying IGP is IS-IS, with RSVP, LDP and BGP enabled. This is a mirror of what we have in production. With all the L2VPNs, the customer-facing physical interface has to be set to the correct encapsulation. For my testing, as I won’t be using VLANs, bridging or setting up a VPLS, I used ethernet-ccc and set the logical interface to family ccc. You can find out more about the different physical encapsulations here

Interface config, RSVP, MPLS, BGP, IS-IS and LDP:
set interfaces xe-0/1/0 enable
set interfaces xe-0/1/0 encapsulation ethernet-ccc
set interfaces xe-0/1/0 unit 0 family ccc
set protocols rsvp interface xe-1/0/0.0
set protocols rsvp interface xe-1/0/2.0
set protocols mpls explicit-null
set protocols mpls ipv6-tunneling
set protocols mpls no-decrement-ttl
set protocols mpls interface xe-1/0/0.0
set protocols mpls interface xe-1/0/2.0
set protocols bgp group Master type internal
set protocols bgp group Master local-address 192.168.2.1
set protocols bgp group Master family inet unicast
set protocols bgp group Master family inet6 unicast
set protocols bgp group Master local-as 100
set protocols bgp group Master neighbor 192.168.2.2 
set protocols bgp group Master neighbor 192.168.2.3
set protocols isis reference-bandwidth 1000g
set protocols isis level 1 disable
set protocols isis level 2 wide-metrics-only
set protocols isis interface xe-1/0/0.0 ldp-synchronization
set protocols isis interface xe-1/0/0.0 point-to-point
set protocols isis interface xe-1/0/0.0 link-protection
set protocols isis interface xe-1/0/2.0 ldp-synchronization
set protocols isis interface xe-1/0/2.0 point-to-point
set protocols isis interface xe-1/0/2.0 link-protection
set protocols isis interface xe-1/0/3.0 ldp-synchronization
set protocols isis interface xe-1/0/3.0 point-to-point
set protocols isis interface xe-1/0/3.0 link-protection
set protocols isis interface lo0.0
set protocols ldp track-igp-metric
set protocols ldp explicit-null
set protocols ldp transport-address router-id
set protocols ldp interface xe-1/0/0.0
set protocols ldp interface xe-1/0/2.0
set protocols ldp interface lo0.0

All configurations will be done on the Master and SiteA, and for my examples I will show work done on the Master Instance. With all that out of the way… Let’s get cracking 😀

Kompella

As stated before, BGP is used as the VPN signalling method. With that in mind, we will need to enable Layer 2 signalling within MP-BGP. This is simply done by adding the command family l2vpn signaling within the BGP stanza. This can be added globally within BGP or under the specific group or neighbour.

set protocols bgp group Master family l2vpn signaling

With the signalling sorted, we can go straight into the configuration of the L2VPN. Just like L3VPNs, L2VPN configuration is done within the routing-instances stanza and uses the same parameters as an L3VPN: a Route Distinguisher (RD) and a Route-Target/vrf-target (RT). The RD has to be unique per device, with the RT matching on all devices within the L2VPN. This is important so that traffic can be routed accordingly per site. In addition, the instance-type has to be set to l2vpn and the interface(s) have to be defined within the routing-instance as well.

set routing-instances Master instance-type l2vpn
set routing-instances Master interface xe-0/1/0.0
set routing-instances Master route-distinguisher 100:0001
set routing-instances Master vrf-target target:100:0000

Next, the properties for that site within the L2VPN need to be configured under protocols l2vpn within the routing-instance. The encapsulation has to match on all sites that want to participate within the VPN. The site identifier must be unique to each site within the L2VPN, as the site ID is used to compute label values for site-to-site communications. The interface(s) have to be defined within both the l2vpn and l2vpn site stanzas.

set routing-instances Master protocols l2vpn encapsulation-type ethernet
set routing-instances Master protocols l2vpn interface xe-0/1/0.0
set routing-instances Master protocols l2vpn site Master site-identifier 1
set routing-instances Master protocols l2vpn site Master interface xe-0/1/0.0
Full Kompella Configuration
set routing-instances Master instance-type l2vpn
set routing-instances Master interface xe-0/1/0.0
set routing-instances Master route-distinguisher 100:0001
set routing-instances Master vrf-target target:100:0000
set routing-instances Master protocols l2vpn encapsulation-type ethernet
set routing-instances Master protocols l2vpn interface xe-0/1/0.0
set routing-instances Master protocols l2vpn site Master site-identifier 1
set routing-instances Master protocols l2vpn site Master interface xe-0/1/0.0
set protocols bgp group Master family l2vpn signaling

Verification

The primary command used to check the status of a pseudowire is show l2vpn connections. As Kompella signalling uses BGP, we will be able to do a show bgp summary and see a route being advertised within the routing-instance and l2vpn tables, using show route table Master.l2vpn.0 or show route table bgp.l2vpn.0 respectively. Additionally, we will be able to check the mpls.0 table to confirm that the L2VPN incoming label and interface(s) for the pseudowire have made the routing table, by using show route table mpls.0.

show l2vpn connections, show bgp summary, show route table Master.l2vpn.0 and show route table mpls.0:
[email protected]> show l2vpn connections    
Layer-2 VPN connections:

Legend for connection status (St)   
EI -- encapsulation invalid      NC -- interface encapsulation not CCC/TCC/VPLS
EM -- encapsulation mismatch     WE -- interface and instance encaps not same
VC-Dn -- Virtual circuit down    NP -- interface hardware not present 
CM -- control-word mismatch      -> -- only outbound connection is up
CN -- circuit not provisioned    <- -- only inbound connection is up
OR -- out of range               Up -- operational
OL -- no outgoing label          Dn -- down                      
LD -- local site signaled down   CF -- call admission control failure      
RD -- remote site signaled down  SC -- local and remote site ID collision
LN -- local site not designated  LM -- local site ID not minimum designated
RN -- remote site not designated RM -- remote site ID not minimum designated
XX -- unknown connection status  IL -- no incoming label
MM -- MTU mismatch               MI -- Mesh-Group ID not available
BK -- Backup connection	         ST -- Standby connection
PF -- Profile parse failure      PB -- Profile busy
RS -- remote site standby	 SN -- Static Neighbor
LB -- Local site not best-site   RB -- Remote site not best-site
VM -- VLAN ID mismatch

Legend for interface status 
Up -- operational           
Dn -- down

Instance: Master
  Local site: Master (1)
    connection-site           Type  St     Time last up          # Up trans
    2                         rmt   Up     Jun  4 12:36:46 2016           2
      Remote PE: 192.168.2.2, Negotiated control-word: Yes (Null)
      Incoming label: 800001, Outgoing label: 800000
      Local interface: xe-0/1/0.0, Status: Up, Encapsulation: ETHERNET

[email protected]> show bgp summary 
Groups: 1 Peers: 2 Down peers: 0
Table          Tot Paths  Act Paths Suppressed    History Damp State    Pending
inet.0               
                       0          0          0          0          0          0
inet6.0              
                       0          0          0          0          0          0
bgp.l2vpn.0          
                       1          1          0          0          0          0
Peer                     AS      InPkt     OutPkt    OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
192.168.2.2             100       3234       3229       0       1  1d 0:19:17 Establ
  inet.0: 0/0/0/0
  inet6.0: 0/0/0/0
  Master.l2vpn.0: 1/1/1/0
  bgp.l2vpn.0: 1/1/1/0
192.168.2.3             100       5735       5724       0       1 1d 19:06:59 Establ
  inet.0: 0/0/0/0
  inet6.0: 0/0/0/0
  Master.l2vpn.0: 0/0/0/0
  bgp.l2vpn.0: 0/0/0/0

[email protected]> show route table Master.l2vpn.0 

Master.l2vpn.0: 2 destinations, 2 routes (2 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100:1:1:1/96                
                   *[L2VPN/170/-101] 1d 20:37:12, metric2 1
                      Indirect
100:2:2:1/96                
                   *[BGP/170] 00:01:02, localpref 100, from 192.168.2.2
                      AS path: I, validation-state: unverified
                    > to 192.168.1.14 via xe-1/0/0.0, Push 0
                      to 192.168.1.6 via xe-1/0/2.0, Push 300000

[email protected]> show route table mpls.0 protocol l2vpn    

mpls.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

800001             *[L2VPN/7] 23:30:09
                    > via xe-0/1/0.0, Pop       Offset: 4
xe-0/1/0.0         *[L2VPN/7] 00:06:20, metric2 100
                    > to 192.168.1.14 via xe-1/0/0.0, Push 800000 Offset: 252
                      to 192.168.1.6 via xe-1/0/2.0, Push 800000, Push 300000(top) Offset: 252

From the end host point of view, we have end-to-end connectivity 😀

[email protected]:~$ ping -c 2 -q 192.168.137.3
PING 192.168.137.3 (192.168.137.3) 56(84) bytes of data.

--- 192.168.137.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.431/0.637/0.843/0.206 ms
Note
The route shown by show route table Master.l2vpn.0 is made up of the Route Distinguisher of the other end of the pseudowire, followed by its site ID and label offset
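
As I said earlier, the site ID is used to compute the label values. Here is a small Python sketch of how I understand that arithmetic to work (label = label base + remote site ID - label offset, as per RFC 6624). The label base of 800000 on both PEs is an assumption on my part, as it isn't shown in the non-extensive output above, and the function names are my own.

```python
def parse_l2vpn_route(route):
    """Split a route such as '100:2:2:1/96' into (RD, site ID, label offset)."""
    key, _, _mask = route.partition("/")
    *rd_parts, site_id, offset = key.split(":")
    return ":".join(rd_parts), int(site_id), int(offset)


def l2vpn_label(label_base, label_offset, remote_site_id):
    """Label a PE hands out for traffic arriving from the given remote site."""
    return label_base + (remote_site_id - label_offset)


rd, site_id, offset = parse_l2vpn_route("100:2:2:1/96")

ASSUMED_LABEL_BASE = 800000  # assumption: not visible in the outputs above

# Label Master (site 1) expects on traffic coming from site 2.
incoming = l2vpn_label(ASSUMED_LABEL_BASE, 1, remote_site_id=site_id)
# Label the remote PE (site 2) expects on traffic coming from Master (site 1).
outgoing = l2vpn_label(ASSUMED_LABEL_BASE, offset, remote_site_id=1)

print(rd, site_id, offset, incoming, outgoing)
```

Under that assumption the numbers line up with the Incoming label: 800001 and Outgoing label: 800000 seen in show l2vpn connections.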

Martini

Martini signalling uses LDP, as stated before, and with LDP already enabled, I will focus on the actual configuration, which is done within the protocols l2circuit stanza. Compared to Kompella, the configuration for the Martini style of L2VPN is much simpler. All that is needed is:

  • The remote neighbour to be defined. In my example I will be using the loopback address of SiteA as the remote neighbour
  • The customer-facing interface connecting into the VPN
  • A circuit ID, which must match on both sides

All this can be done in one line!

set protocols l2circuit neighbor 192.168.2.2 interface xe-0/1/0.0 virtual-circuit-id 1

With that we have Martini style L2VPN configured 🙂

Verifications

To check the status of a Martini style L2VPN, you use show l2circuit connections; the output is near enough the same as show l2vpn connections. As Martini, as discussed above, uses LDP for the signalling, we can use show ldp neighbor to check that the neighbour relationship with the remote side has been successful, and show ldp database to verify that the new labels associated with the pseudowire (L2CKT) have been installed into the database. Additionally, you can check the inet.3 and mpls.0 routing tables by using show route table inet.3 & show route table mpls.0

show l2circuit connections, show ldp neighbor, show ldp database, show route table inet.3 and show route table mpls.0:
[email protected]> show l2circuit connections 
Layer-2 Circuit Connections:

Legend for connection status (St)   
EI -- encapsulation invalid      NP -- interface h/w not present   
MM -- mtu mismatch               Dn -- down                       
EM -- encapsulation mismatch     VC-Dn -- Virtual circuit Down    
CM -- control-word mismatch      Up -- operational                
VM -- vlan id mismatch		 CF -- Call admission control failure
OL -- no outgoing label          IB -- TDM incompatible bitrate 
NC -- intf encaps not CCC/TCC    TM -- TDM misconfiguration 
BK -- Backup Connection          ST -- Standby Connection
CB -- rcvd cell-bundle size bad  SP -- Static Pseudowire
LD -- local site signaled down   RS -- remote site standby
RD -- remote site signaled down  HS -- Hot-standby Connection
XX -- unknown

Legend for interface status  
Up -- operational            
Dn -- down                   
Neighbor: 192.168.2.2 
    Interface                 Type  St     Time last up          # Up trans
    xe-0/1/0.0(vc 1)          rmt   Up     Jun  5 14:03:37 2016           1
      Remote PE: 192.168.2.2, Negotiated control-word: Yes (Null)
      Incoming label: 300000, Outgoing label: 300016
      Negotiated PW status TLV: No
      Local interface: xe-0/1/0.0, Status: Up, Encapsulation: ETHERNET
      Flow Label Transmit: No, Flow Label Receive: No

[email protected]> show ldp neighbor      
Address            Interface          Label space ID         Hold time
192.168.2.2        lo0.0              192.168.2.2:0            43
192.168.1.6        xe-1/0/2.0         192.168.2.3:0            14
192.168.1.14       xe-1/0/0.0         192.168.2.2:0            13

[email protected]> show ldp database                             
Input label database, 192.168.2.1:0--192.168.2.2:0
  Label     Prefix
 299984      192.168.2.1/32
      0      192.168.2.2/32
 300000      192.168.2.3/32
 300016      L2CKT CtrlWord ETHERNET VC 1

Output label database, 192.168.2.1:0--192.168.2.2:0
  Label     Prefix
      0      192.168.2.1/32
 299968      192.168.2.2/32
 299984      192.168.2.3/32
 300000      L2CKT CtrlWord ETHERNET VC 1

Input label database, 192.168.2.1:0--192.168.2.3:0
  Label     Prefix
 300016      192.168.2.1/32
 300000      192.168.2.2/32
      0      192.168.2.3/32

Output label database, 192.168.2.1:0--192.168.2.3:0
  Label     Prefix
      0      192.168.2.1/32
 299968      192.168.2.2/32
 299984      192.168.2.3/32

[email protected]> show route table inet.3 192.168.2.2 

inet.3: 3 destinations, 4 routes (3 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

192.168.2.2/32     *[LDP/9] 1d 21:16:06, metric 100
                    > to 192.168.1.14 via xe-1/0/0.0, Push 0
                      to 192.168.1.6 via xe-1/0/2.0, Push 300000
                    [RSVP/10/1] 1d 01:14:11, metric 100
                    > to 192.168.1.6 via xe-1/0/2.0, label-switched-path to-siteA

[email protected]> show route table mpls.0 protocol l2circuit 

mpls.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

300000             *[L2CKT/7] 00:05:13
                    > via xe-0/1/0.0, Pop       Offset: 4
xe-0/1/0.0         *[L2CKT/7] 00:05:13, metric2 100
                    > to 192.168.1.14 via xe-1/0/0.0, Push 300016 Offset: 252
                      to 192.168.1.6 via xe-1/0/2.0, Push 300016, Push 300000(top) Offset: 252

From the end host point of view, connectivity between the two is there 🙂

[email protected]:~$ ping -c 2 -q 192.168.137.3
PING 192.168.137.3 (192.168.137.3) 56(84) bytes of data.

--- 192.168.137.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.358/0.532/0.707/0.176 ms

Circuit Cross-Connect

As CCC doesn’t support stacked labels, unlike Kompella and Martini, we will need to configure 2 static LSPs between the PE routers. CCC needs one LSP to transmit and another to receive traffic. So firstly, we need to get the LSPs configured. The receive LSP will be configured on the remote PE, so under the protocols mpls label-switched-path stanza we only need to define the transmit LSP. I've used the loopback address of the remote end, with the underlying IGP working out the best path.

set protocols mpls label-switched-path to-siteA to 192.168.2.2
set protocols mpls label-switched-path to-siteA no-cspf

With the LSPs configured, we need to go under the protocols connections stanza. We need to define the customer-facing interface(s) that will be connecting into the VPN, then set the transmit LSP and the receive LSP; the latter is the name of the LSP set on the remote end.

set protocols connections remote-interface-switch siteA interface xe-0/1/0.0
set protocols connections remote-interface-switch siteA transmit-lsp to-siteA
set protocols connections remote-interface-switch siteA receive-lsp to-Master

With that we are sorted!
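
Because each side's receive LSP is the other side's transmit LSP, it is easy to get the names crossed. Here is a little Python sketch that prints the set commands for both PEs from one description, so the mirroring is obvious. The helper name is my own, and the SiteA connection name and interface (xe-0/1/1.0) are made-up values, as I've only shown the Master side in this post.

```python
def ccc_commands(connection, interface, remote_loopback, transmit_lsp, receive_lsp):
    """Build the Junos set commands for one end of a CCC remote-interface-switch."""
    switch = f"set protocols connections remote-interface-switch {connection}"
    return [
        f"set protocols mpls label-switched-path {transmit_lsp} to {remote_loopback}",
        f"set protocols mpls label-switched-path {transmit_lsp} no-cspf",
        f"{switch} interface {interface}",
        f"{switch} transmit-lsp {transmit_lsp}",
        f"{switch} receive-lsp {receive_lsp}",
    ]


# Master side, matching the configuration shown above.
master = ccc_commands("siteA", "xe-0/1/0.0", "192.168.2.2", "to-siteA", "to-Master")
# SiteA side: LSP names swap roles (connection name and interface are hypothetical).
site_a = ccc_commands("Master", "xe-0/1/1.0", "192.168.2.1", "to-Master", "to-siteA")

for line in master + site_a:
    print(line)
```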

Verifications

With CCC there are fewer show commands, from what I’ve found (let me know if there are more, please), but we can check the pseudowire's status by using show connections. We can confirm the Transmit (Ingress) and Receive (Egress) LSPs using show mpls lsp and finally, we can check the mpls.0 table to confirm that the incoming label and interface(s) for the pseudowire have made the routing table, by using show route table mpls.0.

show connections, show mpls lsp and show route table mpls.0:
[email protected]> show connections 
CCC and TCC connections [Link Monitoring On]
Legend for status (St):             Legend for connection types:
 UN -- uninitialized                 if-sw:  interface switching
 NP -- not present                   rmt-if: remote interface switching
 WE -- wrong encapsulation           lsp-sw: LSP switching
 DS -- disabled                      tx-p2mp-sw: transmit P2MP switching
 Dn -- down                          rx-p2mp-sw: receive P2MP switching
 -> -- only outbound conn is up     Legend for circuit types:
 <- -- only inbound  conn is up      intf -- interface
 Up -- operational                   oif  -- outgoing interface
 RmtDn -- remote CCC down            tlsp -- transmit LSP
 Restart -- restarting               rlsp -- receive LSP


Connection/Circuit                Type        St      Time last up     # Up trans
siteA                             rmt-if      Up      Jun  3 12:42:55           1
  xe-0/1/0.0                        intf  Up
  to-siteA                          tlsp  Up
  to-Master                         rlsp  Up

[email protected]> show mpls lsp                           
Ingress LSP: 1 sessions
To              From            State Rt P     ActivePath       LSPname
192.168.2.2     192.168.2.1     Up     0 *     to-siteA         to-siteA
Total 1 displayed, Up 1, Down 0

Egress LSP: 1 sessions
To              From            State   Rt Style Labelin Labelout LSPname 
192.168.2.1     192.168.2.2     Up       0  1 FF  300080        - to-Master
Total 1 displayed, Up 1, Down 0

Transit LSP: 0 sessions
Total 0 displayed, Up 0, Down 0

[email protected]> show route table mpls.0 protocol ccc    

mpls.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

300080             *[CCC/7] 00:00:04
                    > via xe-0/1/0.0, Pop      
xe-0/1/0.0         *[CCC/10/1] 00:00:04, metric 100
                    > to 192.168.1.14 via xe-1/0/0.0, label-switched-path to-siteA
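If you would rather not eyeball the status column, the table from show connections is easy enough to parse. This is a rough sketch that assumes you have already captured the text of that one command (how you collect it is up to you); the function names are my own.

```python
SAMPLE = """\
Connection/Circuit                Type        St      Time last up     # Up trans
siteA                             rmt-if      Up      Jun  3 12:42:55           1
  xe-0/1/0.0                        intf  Up
  to-siteA                          tlsp  Up
  to-Master                         rlsp  Up
"""


def parse_connections(text):
    """Map each connection name to its type, state and per-circuit states."""
    connections = {}
    current = None
    in_table = False
    for line in text.splitlines():
        if line.startswith("Connection/Circuit"):
            in_table = True          # everything before this is the legend
            continue
        if not in_table:
            continue
        fields = line.split()
        if not fields:
            if connections:
                break                # blank line after the table ends it
            continue
        if len(fields) < 3:
            continue
        if not line.startswith(" "):
            # Top-level row: connection name, type, status, ...
            current = fields[0]
            connections[current] = {"type": fields[1], "state": fields[2], "circuits": {}}
        elif current is not None:
            # Indented row: circuit name, circuit type, status
            connections[current]["circuits"][fields[0]] = (fields[1], fields[2])
    return connections


def all_up(connection):
    """True when the connection and every one of its circuits report Up."""
    return connection["state"] == "Up" and all(
        state == "Up" for _, state in connection["circuits"].values()
    )


parsed = parse_connections(SAMPLE)
print(all_up(parsed["siteA"]))
```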

Finally to confirm end-to-end reachability between the end hosts

[email protected]:~$ ping -c 2 -q 192.168.137.3
PING 192.168.137.3 (192.168.137.3) 56(84) bytes of data.

--- 192.168.137.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.358/0.532/0.707/0.176 ms

I had planned to have a wee bit more in this post, with what I was actually testing; however, this is getting a bit longer than I expected, so I'll make this into a two-parter 😉

My next post will detail how you can use traffic engineering to manipulate an L2VPN path between 2 PE routers! Hope to see you there 😀

References

Darren's Blog L2VPN in Junos
RFC 4665
MPLS L2VPN
RFC 6624
RFC 6074
VLAN-based CCC L2VPN


Configuring TACACS+ Server on Ubuntu 14.04LTS

It’s all change in the office so far this year, which is quite good as I’m involved in more projects, and who doesn’t enjoy a few projects 😉

The latest thing I was asked to look into was to create a new TACACS+ server as our current server on a HP Proliant BL460c G1 Blade is going to be decommissioned so we need to give it a new home! It was decided that it should be virtualized as there isn’t a need to have a physical server for something that can be slimmed down dramatically. With that being said this post will go over how to configure a TACACS+ server and configure TACACS+ authentication on a Juniper device.

TACACS+ grew out of the original TACACS, but it is an entirely new protocol and is not compatible with its predecessors, TACACS and XTACACS. TACACS+ uses TCP. Because TACACS+ follows the authentication, authorisation, and accounting (AAA) architecture, these separate components of the protocol can be segregated and handled on separate servers. TACACS+ allows you to set granular access policies for users and groups, commands, location, subnet, or even device type. The TACACS+ protocol also provides detailed logging of users and what commands have been run on specific devices. In addition, the server daemon can run on either Windows or UNIX/Linux.

Although TACACS+ was developed by Cisco Systems, the protocol is openly documented (the original TACACS is described in RFC 1492, and TACACS+ itself was later published as RFC 8907) and has been implemented by a number of different vendors including Alcatel/Lucent, Arbor, Brocade/Foundry, Cisco/Linksys, Extreme, HP/3Com, Huawei, IBM, Juniper/Netscreen, Netgear and many others.

The setup I had for testing was a simple one: I had 2 Ubuntu 14.04LTS hosts on ESXi, one as the TACACS+ server and the second as a jump-box to access a Juniper SRX220 that will be configured for TACACS+ authentication.

With all that talk out of the way, let’s get cracking 🙂

You will need sudo/root privileges

Server Configuration

Fortunately, with the newer versions of Ubuntu you can easily install the tacacs+ package from the apt-get repository; it will also install libtacacs+1:

[email protected]:~$ sudo apt-get install tacacs+
[sudo] password for marquk01: 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
  libtacacs+1
The following NEW packages will be installed
  libtacacs+1 tacacs+

Having installed the package now we can run the command ps -ef | grep tac_plus and it will show us the location of the configuration file and if the process is running:

[email protected]:~$ ps -ef | grep tac_plus
root      1220     1  0 11:37 ?        00:00:00 /usr/sbin/tac_plus -C /etc/tacacs+/tac_plus.conf
marquk01 22730  2682  0 13:55 pts/0    00:00:00 grep --color=auto tac_plus

With the process running, there are a few useful binaries that are important to know; these can be seen when you type tac and hit TAB.

[email protected]:~$ tac
tac  tac_plus  tac_pwd

The important files are tac_plus and tac_pwd:

  • tac_plus is the TACACS+ daemon. You can run the daemon via the CLI
  • tac_pwd is used to generate a Data Encryption Standard (DES) or Message-Digest 5 (MD5) hash from clear text. DES is the default; to generate an MD5 hash you need to add the -m flag.

We will need to configure the tac_plus.conf file, but first we will back up the original file to refer back to if there are any issues:

[email protected]:~$ sudo cp /etc/tacacs+/tac_plus.conf /etc/tacacs+/tac_plus.conf.old

I’ll explain, from the top down, what my file looks like. The default file has more parameters than I used, as my file doesn’t need too much complexity. My example will also show you how to configure the basic Accounting, Secret Key, Users and Groups. Logically, when I look at the layout of the file as I have it, it doesn’t make sense… However, all the information is there soooooo it doesn’t matter :p lol

Accounting

Firstly we’ll need to set the file that the accounting information will be written to. By default this is /var/log/tac_plus.acct; however, you can put this file wherever you like if you don’t want to use the default file and path.

You have to create this file yourself. This can be done by running the command sudo touch /var/log/tac_plus.acct

# Created by Henry-Nicolas Tourneur([email protected])
# See man(5) tac_plus.conf for more details

# Define where to log accounting data, this is the default.

accounting file = /var/log/tac_plus.acct

Secret Key

The Server and Client need to have a matching key so the AAA packets can be encrypted. This key can be anything you wish; however, if you’re going to have a key with white-space, key-words, or special characters, you’ll need to use quotation marks.

# This is the key that clients have to use to access Tacacs+

key = testing123
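To give a feel for what the shared key actually does: as I understand RFC 8907 (section 4.5), the packet body is not encrypted with a modern cipher but XORed with a pad built from chained MD5 hashes of the session ID, the key, the version byte and the sequence number. The sketch below follows that description; the function names and the session ID value are made up for illustration, and it only handles the body (no header parsing).

```python
import hashlib
import struct


def tacacs_pad(session_id, key, version, seq_no, length):
    """Build the MD5-based pseudo-pad: each block hashes the seed plus the previous block."""
    seed = struct.pack("!I", session_id) + key + bytes([version, seq_no])
    pad = b""
    previous = b""
    while len(pad) < length:
        previous = hashlib.md5(seed + previous).digest()
        pad += previous
    return pad[:length]


def obfuscate(body, session_id, key, version=0xC0, seq_no=1):
    """XOR the body with the pad. Running it twice with the same inputs restores the body."""
    pad = tacacs_pad(session_id, key, version, seq_no, len(body))
    return bytes(b ^ p for b, p in zip(body, pad))


key = b"testing123"              # the key from tac_plus.conf above
session_id = 0x12345678          # hypothetical session ID
scrambled = obfuscate(b"kmarquis", session_id, key)
print(obfuscate(scrambled, session_id, key))
```

This is why a mismatched key on either side shows up as garbage packets rather than a clean "wrong key" error, and also why the key should not be treated as strong protection on an untrusted network.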

Users

You’ll need to define the users that will have access to the device. Each user needs to be associated with a group and have their password defined. The password has to be set as either an MD5 or DES hash. By using tac_pwd you can get your hashed output:

[email protected]:~$ tac_pwd
Password to be encrypted: lab123
kBeC6JDjU8icY

There is an additional stanza, service = junos-exec, under each user. This is Juniper specific and I’ll explain this later. I created two users: kmarquis, who will have permission to do anything, and test, who will only have read-only access. Both have the same password. Usernames ARE case sensitive.

# We also can define local users and specify a file where data is stored.
# That file may be filled using tac_pwd
user = kmarquis {
    name = "Keeran Marquis"
    member = admin
    login = des kBeC6JDjU8icY
    service = junos-exec {
        local-user-name = remote-admin
    }
}

user = test {
    name = "Test User"
    member = read-only
    login = des kBeC6JDjU8icY
    service = junos-exec {
        local-user-name = remote-read-only
    }
}

Groups

As you can guess, groups are where you define the level of access and which commands can be used by the group. The commands in my example are used to define actions that are accepted by most vendors, with the exception of Juniper (from my knowledge, but correct me if I’m wrong). I won't be confirming that this part of the configuration works in this post, but I have checked with a Cisco device and the commands worked as expected.

We have a few parameters that are important to remember:

  • default service: defines the default permission that the user will have. If this statement isn’t used or is left blank, the default is deny, meaning that every command permitted to users of this group has to be listed. If you want the default to be allow, then the statement default service = permit is needed
  • service: defines the services that the group is authorised to use; these could include commands that the group is authorised to execute. Authorisation must be configured on both the client and the daemon to operate correctly.
  • cmd: this is where you list a command and set an action, which will be either permit or deny. Additionally, the .* means that anything after the first word is matched, i.e. in my example below, all show commands will be permitted

In my example I have two groups, admin and read-only. The admin group will have full access permitted, and the read-only group, as the name suggests, will have read-only access and will be denied any configuration, clear or restart commands.

# We can also specify rules valid per group of users.
group = admin {
	default service = permit
	service = exec {
		priv-lvl = 15
		}
}

group = read-only {
	service = exec {
		priv-lvl = 15
		}
	cmd = show {
		permit .*
		}
	cmd = write {
		permit term
		}
	cmd = dir {
		permit .*
		}
	cmd = admin {
		permit .*
		}
	cmd = terminal {
		permit .*
		}
	cmd = more {
		permit .*
		}
	cmd = exit {
		permit .*
		}
	cmd = logout {
		permit .*
		}
}
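To make the cmd logic a bit more concrete, here is a simplified model of how I picture the daemon deciding on a command: the first word picks the cmd block, the rest of the line is matched against each regex in order, and anything that isn't matched falls back to the default. This is my own sketch, not tac_plus code, and the real daemon has more rules than this.

```python
import re

# Hand-written copy of the read-only group's cmd rules above (for illustration only).
READ_ONLY_RULES = {
    "show": [("permit", r".*")],
    "write": [("permit", r"term")],
    "dir": [("permit", r".*")],
    "admin": [("permit", r".*")],
    "terminal": [("permit", r".*")],
    "more": [("permit", r".*")],
    "exit": [("permit", r".*")],
    "logout": [("permit", r".*")],
}


def is_authorised(command_line, rules, default_permit=False):
    """Return True if the command would be permitted under this simplified model."""
    parts = command_line.split(None, 1)
    if not parts:
        return False
    cmd = parts[0]
    args = parts[1] if len(parts) > 1 else ""
    if cmd not in rules:
        # Command not listed at all: fall back to the group's default.
        return default_permit
    for action, pattern in rules[cmd]:
        if re.search(pattern, args):
            return action == "permit"   # first matching rule wins
    return False                        # listed command, but no rule matched


for line in ("show configuration", "write term", "write memory", "configure"):
    print(line, "->", is_authorised(line, READ_ONLY_RULES))
```

The admin group would be the same call with an empty rule set and default_permit=True, which is all that default service = permit is doing in this model.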

My completed tac_plus file can be seen here.

Note
For more in-depth detail and additional parameters that can be configured in this file, see the man pages (man tac_plus.conf) or the online Ubuntu tac_plus Manual Documentation

Once you’re happy with everything you can run service tacacs_plus check to make sure the syntax is correct. If you don't get any errors, you will need to restart the daemon using service tacacs_plus restart so that the new configuration is loaded.

TACACS+ Daemon Commands
Additional commands that will be useful to remember:

service tacacs_plus check
service tacacs_plus status
service tacacs_plus stop
service tacacs_plus start
service tacacs_plus restart

With that we have a TACACS+ server configured 🙂

Before getting into the configuration of the SRX: I stated earlier that there’s a Juniper-specific stanza in the tac_plus.conf file. When authenticating users against a TACACS+ server on Juniper devices, you’ll need to apply Juniper Networks Vendor-Specific TACACS+ Attributes.

These attributes can be either:

  1. Specified in the tac_plus.conf file by using regular expressions to list all the commands that the user is permitted or denied. A user will still need to be created on the device, with that user being referred to under the local-user-name statement. The stanza would look something like:
    service = junos-exec {
    	local-user-name = xxx
    	allow-commands = .*
    	allow-configuration = .*
    	deny-commands =
    	deny-configuration =
    	user-permissions =
    	}
  2. Configured as a class on the device that states all the permitted or denied permissions; this class is then linked to a user. Both need to be configured on the device. Once this has been created you’ll need to refer to said user under the local-user-name statement

The Junos OS retrieves these attributes through an authorization request of the TACACS+ server after authenticating a user. For my example, I went with the latter. Now we’ll jump onto the SRX220 and get that sorted with TACACS+ AAA configuration.

Juniper Configuration

Firstly, you will have to set the TACACS+ server with its secret key. As standard practice and force of habit, I have set single-connection and forced the source-address of the SRX. The single-connection statement means that, instead of opening a separate TCP session to the server for each request, the device maintains a single session to it. In addition, as best practice an authentication order should be set so that if there is an issue or loss of connectivity to the TACACS+ server, you’ll be able to fall back to locally defined users.

authentication-order [ tacplus password ];
tacplus-server {
    10.1.0.148 {
        secret "$9$SszyMXVb2aGiYgi.fzCAIEcyvWX7-w24"; ## SECRET-DATA
        single-connection;
        source-address 10.1.0.158;
    }
}

With the TACACS+ server we’re able to log different events that take place on the device and get those commands sent to the server. From my experience the accounting events that you would most want logged are logins, configuration changes and interactive commands. This is set under the system accounting stanza:

accounting {
    events [ login change-log interactive-commands ];
    destination {
        tacplus;
    }
}

Next, under the system login stanza, you need to create a class that lists the permissions available to the user(s) that are going to be associated with it. These users are the ones referenced in the tac_plus.conf file. In my example I created two classes: super-user-local with all permissions, and read-only-user-local with read-only and basic troubleshooting options (i.e. ping, traceroute, telnet etc). I then associated these classes with 2 users, remote-admin and remote-read-only:

login {
    class read-only-user-local {
        permissions [ network view view-configuration ];
    }
    class super-user-local {            
        permissions all;
    }
    user remote-admin {
        full-name "TACACS User";
        uid 2001;
        class super-user-local;
    }
    user remote-read-only {
        full-name "TACACS read-only user";
        uid 2002;
        class read-only-user-local;
    }
}
NOTE
You can learn more about the different permissions flags available here on Juniper TechLibrary

Verifications

To confirm the configuration is working as expected, I will ssh onto the SRX220 with both the admin user kmarquis and the read-only user test. With both users, I will log in and try to configure the description This is a test on a random port. As you can see below I had no problem with user kmarquis. However, when I logged in with the test user I wasn’t able to enter the configuration mode as the permission wasn’t granted, and for that user the command isn’t even recognized. I ran a show command and you will see that none of the passwords are shown. Again this is due to the permission level granted.

Admin access first, then read-only access:
[email protected]:~$ ssh 10.1.0.158 -l kmarquis
Password: 
--- JUNOS 12.1X47-D30.4 built 2015-11-13 14:16:02 UTC
[email protected]> configure 
Entering configuration mode
[edit]
[email protected]# set interfaces ge-0/0/5 description "This is a test" 

[edit]
[email protected]# commit and-quit 

[email protected]>
[email protected]:~$ ssh 10.1.0.158 -l test
Password: 
--- JUNOS 12.1X47-D30.4 built 2015-11-13 14:16:02 UTC
[email protected]> configure
                 ^
unknown command.

[email protected]> show configuration 
## Last commit: 2016-02-01 12:56:23 UTC by kmarquis
version 12.1X47-D30.4;
system {
    host-name v6-testing;
    authentication-order [ tacplus password ];
    root-authentication {
        encrypted-password /* SECRET-DATA */; ## SECRET-DATA
    }

If we check the /var/log/tac_plus.acct file we’ll be able to see all the permitted commands by each user. This is additional confirmation that the users have successfully authenticated against the TACACS+ server and their related permissions authorised to the device.

Feb  1 12:55:38 10.1.0.158      kmarquis        ttyp0   10.1.0.137      start   task_id=1       service=shell   process*mgd[38808]      cmd=login
Feb  1 12:55:41 10.1.0.158      kmarquis        ttyp0   10.1.0.137      stop    task_id=2       service=shell   process*mgd[38808]      cmd=show configuration 
Feb  1 12:55:44 10.1.0.158      kmarquis        ttyp0   10.1.0.137      stop    task_id=3       service=shell   process*mgd[38808]      cmd=edit 
Feb  1 12:56:01 10.1.0.158      kmarquis        ttyp0   10.1.0.137      stop    task_id=4       service=shell   process*mgd[38808]      cmd=set: [interfaces ge-0/0/5 de$
Feb  1 12:56:01 10.1.0.158      kmarquis        ttyp0   10.1.0.137      stop    task_id=5       service=shell   process*mgd[38808]      cmd=set interfaces ge-0/0/5 desc$
Feb  1 12:56:05 10.1.0.158      kmarquis        ttyp0   10.1.0.137      stop    task_id=6       service=shell   process*mgd[38808]      cmd=commit and-quit 
Feb  1 12:56:27 10.1.0.158      kmarquis        ttyp0   10.1.0.137      stop    task_id=7       service=shell   process*mgd[38808]      cmd=exit 
Feb  1 12:56:27 10.1.0.158      kmarquis        ttyp0   10.1.0.137      stop    task_id=1       service=shell   elapsed_time=49 process*mgd[38808]      cmd=logout
Feb  1 12:56:34 10.1.0.158      test    ttyp0   10.1.0.137      start   task_id=1       service=shell   process*mgd[38845]      cmd=login
Feb  1 12:56:44 10.1.0.158      test    ttyp0   10.1.0.137      stop    task_id=2       service=shell   process*mgd[38845]      cmd=show configuration 
Feb  1 12:56:53 10.1.0.158      test    ttyp0   10.1.0.137      stop    task_id=3       service=shell   process*mgd[38845]      cmd=show system uptime 
Feb  1 12:56:56 10.1.0.158      test    ttyp0   10.1.0.137      stop    task_id=4       service=shell   process*mgd[38845]      cmd=exit 
Feb  1 12:56:56 10.1.0.158      test    ttyp0   10.1.0.137      stop    task_id=1       service=shell   elapsed_time=22 process*mgd[38845]      cmd=logout
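If the accounting file gets busy, a few lines of Python will pull out who ran what. This sketch assumes the line layout shown above (timestamp, device, user, tty, remote address, start/stop, then attribute pairs ending in cmd=); the regex and function name are my own, so adjust them if your file looks different.

```python
import re
from collections import defaultdict

SAMPLE = [
    "Feb  1 12:55:38 10.1.0.158      kmarquis        ttyp0   10.1.0.137      start   task_id=1       service=shell   process*mgd[38808]      cmd=login",
    "Feb  1 12:56:05 10.1.0.158      kmarquis        ttyp0   10.1.0.137      stop    task_id=6       service=shell   process*mgd[38808]      cmd=commit and-quit ",
    "Feb  1 12:56:53 10.1.0.158      test    ttyp0   10.1.0.137      stop    task_id=3       service=shell   process*mgd[38845]      cmd=show system uptime ",
]

LINE_RE = re.compile(
    r"^(?P<time>\w{3}\s+\d+\s+[\d:]+)\s+(?P<nas>\S+)\s+(?P<user>\S+)\s+"
    r"(?P<tty>\S+)\s+(?P<remote>\S+)\s+(?P<kind>start|stop)\s+(?P<rest>.*)$"
)


def commands_by_user(lines):
    """Group the cmd= value of each accounting record by username."""
    result = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue                      # skip anything that isn't an accounting record
        cmd = re.search(r"cmd=(.*)$", match.group("rest"))
        if cmd:
            result[match.group("user")].append(cmd.group(1).strip())
    return dict(result)


for user, commands in commands_by_user(SAMPLE).items():
    print(user, commands)
```

To run it against the real file you would pass in the lines of /var/log/tac_plus.acct (or wherever you pointed the accounting file) instead of SAMPLE.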

And with that all, we have a fully configured and working AAA TACACS+ server 🙂

Extra Treat 🙂
I have included the set commands below:

set system tacplus-server 10.1.0.148 secret "$9$SszyMXVb2aGiYgi.fzCAIEcyvWX7-w24"
set system tacplus-server 10.1.0.148 single-connection
set system tacplus-server 10.1.0.148 source-address 10.1.0.158

set system authentication-order tacplus
set system authentication-order password

set system accounting events login
set system accounting events change-log
set system accounting events interactive-commands
set system accounting destination tacplus

set system login class super-user-local permissions all
set system login class read-only-user-local permissions network
set system login class read-only-user-local permissions view
set system login class read-only-user-local permissions view-configuration

set system login user remote-read-only full-name "TACACS read-only user"
set system login user remote-read-only uid 2005
set system login user remote-read-only class read-only-user-local
set system login user remote-admin full-name "TACACS User"
set system login user remote-admin uid 2006
set system login user remote-admin class super-user-local
Extra Extra Treat 😀
P.S. If you want to see what configuration could be used on a Cisco device I have added it below. Although I didn’t test it myself, this is the config we have in production and it works :p

aaa new-model
aaa authentication login default group tacacs+ local enable
aaa authorization exec default group tacacs+ local none 
aaa authorization commands 0 default group tacacs+ local none 
aaa authorization commands 1 default group tacacs+ local none 
aaa authorization commands 15 default group tacacs+ local none 
aaa accounting exec default start-stop group tacacs+
aaa accounting commands 0 default start-stop group tacacs+
aaa accounting commands 1 default start-stop group tacacs+
aaa accounting commands 15 default start-stop group tacacs+
aaa session-id common

Reference

Configure TACACS+ Ubuntu 14.04LTS
TACACS+ Accounting
TACACS+ Authentication
TACACS+ Advantages


Upgrading Dual Routing Engine Juniper MX Series

In one of my previous posts, I explained how you would go about upgrading a Juniper EX switch. I said whenever I got the chance to upgrade an MX Series Router, I’d get something noted down…… *raises hands* today is the day! As I’ve said in a few posts, there has been a lot of change and the team is now getting access to the Core Juniper MX Series Routers. As part of this increased access, one of our first tasks is to upgrade JunOS from 12.3R5.7 to 14.1R6.4. Most, if not all, MX Series above the MX80 come with two Routing Engines (RE), and both are independent of each other. This being the case, when upgrading an MX you will need to upgrade each RE individually.

This post will go over what you will need to do to upgrade an MX Router. In my setup I’ll be upgrading a Juniper MX480 Router and I’ll be doing the upgrade via the console port on each Routing Engine.

To link two Routing Engines together, you would need to apply similar configuration to what I used:

set groups re0 system host-name re0-mx480
set groups re0 interfaces fxp0 unit 0 family inet address x.x.x.x/x
set groups re1 system host-name re1-mx480
set groups re1 interfaces fxp0 unit 0 family inet address x.x.x.x/x
set apply-groups re1
set apply-groups re0

set chassis redundancy graceful-switchover
set routing-options nonstop-routing
set system commit synchronize

With that all cleared up.. Let’s get cracking 🙂

Pre Works

Upload the new firmware version to wherever you normally keep them. Currently, we upload the package into the /var/tmp folder on the device in question

[[email protected] ~]$ scp jinstall-14.1R6.4-domestic-signed.tgz re0-mx480:/var/tmp
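Release numbers are easy to mistype, so before copying anything it can be worth checking that the package name really carries the release you are aiming for. A tiny sketch of that check follows; the function name is made up, and the regex only covers the simple major.minorRx.y style of release used in this post (it will not match X-style releases such as 12.1X47-D30.4).

```python
import re


def release_from_package(filename):
    """Pull a release such as 14.1R6.4 out of a jinstall package name, or None."""
    match = re.search(r"jinstall-(\d+\.\d+[A-Z]\d+\.\d+)-", filename)
    return match.group(1) if match else None


target = "14.1R6.4"
package = "jinstall-14.1R6.4-domestic-signed.tgz"
if release_from_package(package) != target:
    raise SystemExit(f"{package} does not look like release {target}")
print("package matches target release")
```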

Having just explained how to link the two REs together, for an upgrade you will need to disable graceful-switchover and nonstop-routing. Skipping this step can potentially result in the control plane and forwarding plane running two different JUNOS versions, which can cause a number of potential issues!

deactivate chassis redundancy graceful-switchover
deactivate routing-options nonstop-routing

Upgrade Process

Having disabled both graceful-switchover and nonstop-routing, log onto the Backup RE, either by console or from the Master RE run the command request routing-engine login re1. Once on the Backup RE, you will need to run the command request system software validate add /var/tmp/xxx reboot.

[email protected]> request system software validate add /var/tmp/jinstall-14.1R6.4-domestic-signed.tgz reboot
NOTE
If, like us, you save the new firmware package on the device itself, DO NOT specify which RE holds the package when you run the software add command. If you do include the package’s location, the image will be deleted from the device once the upgrade has completed on one of the REs!
Additionally, if you requested a session from re0 to re1 to reach the Backup RE, you will get this message and be booted off once the RE reboots:

[email protected]>                                                                                
*** FINAL System shutdown message from [email protected] ***                 

System going down IMMEDIATELY                                                  

                                                                               
rlogin: connection closed

If you have console access, you can watch the upgrade ticking along, if you don’t, you can confirm the Backup RE is up and running by using the command show chassis routing-engine, it will show the status and hardware stats for both Routing Engines.

show chassis routing-engine output
[email protected]> show chassis routing-engine    
Routing Engine status:
  Slot 0:
    Current state                  Master
    Election priority              Master (default)
    Temperature                 31 degrees C / 87 degrees F
    CPU temperature             37 degrees C / 98 degrees F
    DRAM                      3584 MB (3584 MB installed)
    Memory utilization          20 percent
    CPU utilization:
      User                       0 percent
      Background                 0 percent
      Kernel                     4 percent
      Interrupt                  0 percent
      Idle                      96 percent
    Model                          RE-S-2000
    Serial ID                      9012021718
    Start time                     2016-03-22 13:14:37 GMT
    Uptime                         3 hours, 38 minutes, 16 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       0.01       0.01       0.00
Routing Engine status:
  Slot 1:
    Current state                  Backup
    Election priority              Backup (default)
    Temperature                 33 degrees C / 91 degrees F
    CPU temperature             38 degrees C / 100 degrees F
    DRAM                      3584 MB (4096 MB installed)
    Memory utilization          16 percent
    CPU utilization:
      User                       0 percent
      Background                 0 percent
      Kernel                     0 percent
      Interrupt                  0 percent
      Idle                     100 percent
    Model                          RE-S-2000
    Serial ID                      9012022174
    Start time                     2016-03-22 16:47:56 GMT
    Uptime                         4 minutes, 45 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       0.34       0.47       0.23
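When you are doing this on more than one chassis, checking which slot is Master and which is Backup by eye gets old quickly. Here is a small sketch that pulls the state of each slot out of the show chassis routing-engine text; it assumes the output layout shown above and that you have already captured it as a string, and the function name is mine.

```python
import re

SAMPLE = """\
Routing Engine status:
  Slot 0:
    Current state                  Master
    Election priority              Master (default)
Routing Engine status:
  Slot 1:
    Current state                  Backup
    Election priority              Backup (default)
"""


def re_states(text):
    """Return {slot_number: current_state} from 'show chassis routing-engine' output."""
    states = {}
    slot = None
    for line in text.splitlines():
        slot_match = re.match(r"\s*Slot (\d+):", line)
        if slot_match:
            slot = int(slot_match.group(1))
            continue
        state_match = re.match(r"\s*Current state\s+(\S+)", line)
        if state_match and slot is not None:
            states[slot] = state_match.group(1)
    return states


print(re_states(SAMPLE))
```

Running the same function before and after the mastership switch gives you a quick way to confirm that the roles really did swap.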

Having upgraded the Backup RE, to reduce the downtime and service impact you can fail over the Master Routing Engine, so that the Backup becomes the new Master Routing Engine. This is a manual process: from the current Master RE, run the command request chassis routing-engine master switch. This WILL cause a brief outage as the PFE is reset and the new firmware is loaded.

[email protected]> request chassis routing-engine master switch    
warning: Traffic will be interrupted while the PFE is re-initialized
Toggle mastership between routing engines ? [yes,no] (no) yes 

Resolving mastership...
Complete. The other routing engine becomes the master.

You can confirm by running show chassis routing-engine again on RE0

AFTER failing over Routing Engine
[email protected]> show chassis routing-engine 
Routing Engine status:
  Slot 0:
    Current state                  Backup
    Election priority              Master (default)
    Temperature                 32 degrees C / 89 degrees F
    CPU temperature             39 degrees C / 102 degrees F
    DRAM                      3584 MB (3584 MB installed)
    Memory utilization          16 percent
    CPU utilization:
      User                       2 percent
      Background                 0 percent
      Kernel                     1 percent
      Interrupt                  0 percent
      Idle                      97 percent
    Model                          RE-S-2000
    Serial ID                      9012021718
    Start time                     2016-03-22 13:14:37 GMT
    Uptime                         3 hours, 50 minutes, 7 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       0.21       0.07       0.02
Routing Engine status:
  Slot 1:
    Current state                  Master
    Election priority              Backup (default)
    Temperature                 33 degrees C / 91 degrees F
    CPU temperature             41 degrees C / 105 degrees F
    DRAM                      3584 MB (4096 MB installed)
    Memory utilization          22 percent
    CPU utilization:
      User                      43 percent
      Background                 0 percent
      Kernel                    28 percent
      Interrupt                  0 percent
      Idle                      29 percent
    Model                          RE-S-2000
    Serial ID                      9012022174
    Start time                     2016-03-22 16:47:56 GMT
    Uptime                         16 minutes, 42 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       4.71       1.11       0.46

Having failed over the RE, all that is needed now is to repeat the same command as before, request system software validate add /var/tmp/xxx reboot, to install the new firmware on RE0

[email protected]> request system software validate add /var/tmp/jinstall-14.1R6.4-domestic-signed.tgz reboot

Once the Routing Engine has upgraded, you need to re-enable graceful-switchover and nonstop-routing first; you will see why in a moment.

activate chassis redundancy graceful-switchover
activate routing-options nonstop-routing

After commit synchronising those changes, you will need to set RE0 back to the Master Routing Engine. This will be done by running request chassis routing-engine master switch from RE1 now.

[email protected]>request chassis routing-engine master switch 
Toggle mastership between routing engines ? [yes,no] (no) yes 

Resolving mastership...
Complete. The other routing engine becomes the master.

{backup}
[email protected]> 

Now that Graceful Switchover is enabled, when you run the command you won't see the same warning about traffic being disrupted. This is because Graceful Switchover preserves interface and kernel information, allowing the PFE to continue forwarding packets even though one of the REs is unavailable.

NOTE
More detail on Graceful Switchover can be found on Juniper’s TechLibrary

We will need to save a backup of the currently running and active file system by issuing the command request system snapshot on both primary as well as backup REs

[email protected]> request system snapshot 
Doing the initial labeling...
Verifying compatibility of destination media partitions...
Running newfs (899MB) on hard-disk media  / partition (ad2s1a)...
Running newfs (100MB) on hard-disk media  /config partition (ad2s1e)...
Copying '/dev/ad0s1a' to '/dev/ad2s1a' .. (this may take a few minutes)
Copying '/dev/ad0s1e' to '/dev/ad2s1e' .. (this may take a few minutes)
The following filesystems were archived: / /config

Finally, confirm the code version by running show version invoke-on all-routing-engines. I’ve used the commands show version and show version invoke-on other-routing-engine just because I can put the two outputs into a table-like thing and it looks neater in a post :p

show version on RE0, then on RE1:
{master}
[email protected]> show version          
Hostname: RE0-MX480-02
Model: mx480
Junos: 14.1R6.4
JUNOS Base OS boot [14.1R6.4]
JUNOS Base OS Software Suite [14.1R6.4]
JUNOS Packet Forwarding Engine Support (M/T/EX Common) [14.1R6.4]
JUNOS Packet Forwarding Engine Support (MX Common) [14.1R6.4]
JUNOS platform Software Suite [14.1R6.4]
JUNOS Runtime Software Suite [14.1R6.4]
JUNOS Online Documentation [14.1R6.4]
JUNOS Services AACL Container package [14.1R6.4]
JUNOS AppId Services [14.1R6.4]
JUNOS Services Application Level Gateways [14.1R6.4]
JUNOS Services Captive Portal and Content Delivery Container package [14.1R6.4]
JUNOS Border Gateway Function package [14.1R6.4]
JUNOS Services HTTP Content Management package [14.1R6.4]
JUNOS IDP Services [14.1R6.4]
JUNOS Services LL-PDF Container package [14.1R6.4]
JUNOS Services Jflow Container package [14.1R6.4]
JUNOS Services MobileNext Software package [14.1R6.4]
JUNOS Services Mobile Subscriber Service Container package [14.1R6.4]
JUNOS Services PTSP Container package [14.1R6.4]
JUNOS Services NAT [14.1R6.4]
JUNOS Services RPM [14.1R6.4]           
JUNOS Services Stateful Firewall [14.1R6.4]
JUNOS Voice Services Container package [14.1R6.4]
JUNOS Services Crypto [14.1R6.4]
JUNOS Services SSL [14.1R6.4]
JUNOS Services IPSec [14.1R6.4]
JUNOS py-base-i386 [14.1R6.4]
JUNOS Kernel Software Suite [14.1R6.4]
JUNOS Crypto Software Suite [14.1R6.4]
JUNOS Routing Software Suite [14.1R6.4]
{master}
[email protected]> show version invoke-on other-routing-engine 
re1:
--------------------------------------------------------------------------
Hostname: RE1-MX480-02
Model: mx480
Junos: 14.1R6.4
JUNOS Base OS boot [14.1R6.4]
JUNOS Base OS Software Suite [14.1R6.4]
JUNOS Packet Forwarding Engine Support (M/T/EX Common) [14.1R6.4]
JUNOS Packet Forwarding Engine Support (MX Common) [14.1R6.4]
JUNOS platform Software Suite [14.1R6.4]
JUNOS Runtime Software Suite [14.1R6.4]
JUNOS Online Documentation [14.1R6.4]
JUNOS Services AACL Container package [14.1R6.4]
JUNOS Services Application Level Gateways [14.1R6.4]
JUNOS AppId Services [14.1R6.4]
JUNOS Services Captive Portal and Content Delivery Container package [14.1R6.4]
JUNOS Border Gateway Function package [14.1R6.4]
JUNOS Services HTTP Content Management package [14.1R6.4]
JUNOS Services Jflow Container package [14.1R6.4]
JUNOS IDP Services [14.1R6.4]
JUNOS Services LL-PDF Container package [14.1R6.4]
JUNOS Services MobileNext Software package [14.1R6.4]
JUNOS Services Mobile Subscriber Service Container package [14.1R6.4]
JUNOS Services NAT [14.1R6.4]           
JUNOS Services RPM [14.1R6.4]
JUNOS Services PTSP Container package [14.1R6.4]
JUNOS Services Stateful Firewall [14.1R6.4]
JUNOS Voice Services Container package [14.1R6.4]
JUNOS Services SSL [14.1R6.4]
JUNOS Services Crypto [14.1R6.4]
JUNOS Services IPSec [14.1R6.4]
JUNOS py-base-i386 [14.1R6.4]
JUNOS Kernel Software Suite [14.1R6.4]
JUNOS Crypto Software Suite [14.1R6.4]
JUNOS Routing Software Suite [14.1R6.4]
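If you would rather check this from a script than by eye, the comparison can be sketched in Python. This is only an illustration: the junos_version helper and the variable names are hypothetical, and the sketch assumes nothing more than the "Junos:" line shown in the two outputs above.

```python
import re


def junos_version(show_version_output: str) -> str:
    """Return the value of the 'Junos:' line from 'show version' output."""
    match = re.search(r"^Junos:\s*(\S+)", show_version_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Junos:' line found")
    return match.group(1)


# Trimmed copies of the two outputs above (hypothetical variable names).
re0_output = "Hostname: RE0-MX480-02\nModel: mx480\nJunos: 14.1R6.4\n"
re1_output = "re1:\nHostname: RE1-MX480-02\nModel: mx480\nJunos: 14.1R6.4\n"

versions_match = junos_version(re0_output) == junos_version(re1_output)
print("Both REs on the same version:", versions_match)
```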

And with that, we have an upgraded Dual Routing Engine MX Series router! 😀 Yay! Now that I’ve got the access, I’ll most likely mess about with an ISSU upgrade on MX next, so keep an eye out for that one!

Reference

Configuring Dual Routing Engines MX Series
Procedure to Upgrade JUNOS on a Dual Routing Engine System
Understanding Graceful Switchover (GRES)


Open Shortest Path First Notes

As part of my studies, this post will be my notes on the routing protocol Open Shortest Path First (OSPF).

OSPF Basics

  • What is OSPF?
  • OSPF Structure

Neighbour Discovery

  • Inter-Node Communication
  • OSPF Packet Details
  • OSPF Hello Messages Details
  • Router-ID Selection Process
  • OSPF Neighbour Adjacency Process
  • Designated Router & Backup Designated Router
  • Designated Router Election

Network Types

  • Broadcast
  • Non-Broadcast Multi-Access
  • Point-to-Point
  • Point-to-Multipoint
  • Loopback

Scaling OSPF

  • Areas
  • Router Types
  • OSPF Route Types
  • Link-State Advertisement Types
  • Area Types

OSPF Basics

What is OSPF

Open Shortest Path First (OSPF) is an open-standard Interior Gateway Protocol (IGP). Unlike distance-vector or path-vector routing protocols such as Routing Information Protocol (RIP), Enhanced Interior Gateway Routing Protocol (EIGRP) or Border Gateway Protocol (BGP), OSPF is a link-state protocol. Each router sends out OSPF advertisements, known as Link-State Advertisements (LSAs), to share its local Link-State Database (LSDB) with other OSPF-enabled devices, building an overall topology of every router, link state and link metric within the network. Edsger W. Dijkstra’s Shortest Path First (SPF) algorithm is then run against that topology to calculate the best routes. OSPF is defined in RFC2328:

OSPF is a link-state routing protocol. It is designed to be run internal to a single Autonomous System. Each OSPF router maintains an identical database describing the Autonomous System’s topology. From this database, a routing table is calculated by constructing a shortest-path tree.

OSPF recalculates routes quickly in the face of topological changes, utilizing a minimum of routing protocol traffic. OSPF provides support for equal-cost multipath. An area routing capability is provided, enabling an additional level of routing protection and a reduction in routing protocol traffic. In addition, all OSPF routing protocol exchanges are authenticated.

OSPF advertises and receives LSAs to/from neighbouring routers; these LSAs are stored in the router’s local LSDB. Whenever there is a change in the network, new LSAs are flooded across the routing domain and all the routers have to update their LSDB. This is due to the nature of the Link State and SPF algorithms; essentially, all OSPF routers have to hold the same synchronized copy of the Link-State Database to build a complete, loop-free map of the network topology.
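To make the SPF step a little more concrete, here is a minimal Python sketch of Dijkstra’s algorithm run against a toy LSDB. The three-router topology, the metrics and the shortest_paths name are all made up for illustration; a real implementation builds a full shortest-path tree with next hops rather than just total costs.

```python
import heapq


def shortest_paths(lsdb, root):
    """Dijkstra's SPF: lowest total cost from root to every reachable router."""
    cost = {root: 0}
    queue = [(0, root)]
    while queue:
        current_cost, router = heapq.heappop(queue)
        if current_cost > cost.get(router, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, link_cost in lsdb.get(router, {}).items():
            new_cost = current_cost + link_cost
            if new_cost < cost.get(neighbour, float("inf")):
                cost[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return cost


# Hypothetical topology: router -> {neighbour: link metric}
lsdb = {
    "R1": {"R2": 10, "R3": 1},
    "R2": {"R1": 10, "R3": 1},
    "R3": {"R1": 1, "R2": 1},
}
print(shortest_paths(lsdb, "R1"))
```

In this toy topology R1 reaches R2 via R3 (total cost 2) rather than over the direct link (cost 10), which is exactly the kind of decision every router makes independently from its identical copy of the LSDB.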

OSPF Structure

OSPF can be described as a two-tier hierarchical structure, because there are two main area types: the Backbone Area and Non-Backbone Areas. The Backbone Area is known as Area 0 and Non-Backbone Areas are all other areas. All Non-Backbone Areas MUST connect to Area 0. It is important to note that OSPF routers in different areas DO NOT have the same synchronized copy of the Link-State Database; only routers within the same area have an identical Link-State Database. This is because Area 0 provides transit for all Non-Backbone Areas: Non-Backbone Areas advertise their routes into Area 0, and Area 0 advertises all routes learnt to the other areas, as shown here.

Neighbour Discovery

Inter-Node Communication

Communication between OSPF routers is done, dependent on network type, over IP using its own protocol number, 89, with multicast OSPF packets sent between routers. There are two multicast addresses that have been defined for OSPF-enabled routers/interfaces to dynamically find neighbours. RFC2328 defines them as:

AllSPFRouters: This multicast address has been assigned the value 224.0.0.5. All routers running OSPF should be prepared to receive packets sent to this address. Hello packets are always sent to this destination. Also, certain OSPF protocol packets are sent to this address during the flooding procedure.

AllDRouters: This multicast address has been assigned the value 224.0.0.6. Both the Designated Router and Backup Designated Router must be prepared to receive packets destined to this address. Certain OSPF protocol packets are sent to this address during the flooding procedure.

OSPF Packet Details

As stated above, OSPF has its own dedicated IP protocol number, reserved by the Internet Assigned Numbers Authority (IANA). Within the protocol, OSPF exchanges 5 types of packets:

Type | Packet Name | Packet Function
1 | Hello | Discovers and maintains neighbours. Hellos are sent to ensure that neighbours are still available and online
2 | Database Description (DBD/DDP) | Summarizes database contents. When an adjacency is being formed, this packet describes the content of the sender’s Link-State Database
3 | Link-State Request (LSR) | Database download. Used to request more detail about a portion of the LSDB from another router, when some details are regarded as stale
4 | Link-State Update (LSU) | Database update. This packet is normally sent in response to an LSR packet; it provides an update to the LSDB as requested by a neighbour
5 | Link-State Ack | Flooding acknowledgment. When the router receives an LSA flood, it responds with an acknowledgment, which is what makes OSPF flooding reliable
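As a rough illustration of where the packet type lives on the wire, the sketch below decodes the fixed OSPFv2 packet header from RFC 2328 (version, type, length, Router ID, Area ID, checksum and authentication type, followed by 8 bytes of authentication data). The parse_ospf_header name and the sample packet are my own invention for the example, and no checksum or authentication handling is attempted.

```python
import ipaddress
import struct

PACKET_TYPES = {
    1: "Hello",
    2: "Database Description",
    3: "Link-State Request",
    4: "Link-State Update",
    5: "Link-State Acknowledgment",
}


def parse_ospf_header(data: bytes) -> dict:
    """Decode the first 16 bytes of the 24-byte OSPFv2 packet header."""
    version, ptype, length, router_id, area_id, _checksum, _autype = struct.unpack(
        "!BBH4s4sHH", data[:16]
    )
    return {
        "version": version,
        "type": PACKET_TYPES.get(ptype, "Unknown"),
        "length": length,
        "router_id": str(ipaddress.IPv4Address(router_id)),
        "area_id": str(ipaddress.IPv4Address(area_id)),
    }


# Hand-built sample header: an OSPFv2 Hello from 10.0.0.1 in Area 0.
sample = struct.pack(
    "!BBH4s4sHH8s", 2, 1, 44, bytes([10, 0, 0, 1]), bytes(4), 0, 0, bytes(8)
)
header = parse_ospf_header(sample)
print(header)
```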

OSPF Hello Messages Details

As stated earlier, OSPF packets are exchanged between routers to allow them to hold the same synchronized OSPF database. For adjacency discovery and maintenance, OSPF Hello Messages are sent out of all OSPF-enabled interfaces; two routers whose hello parameters are compatible will create an OSPF adjacency. The table below shows the parameters involved, with the first eight needing to agree between the two routers (matching values, or a unique value in the case of the Router ID) for an adjacency to form:

Parameter | Function
Hello Interval | Amount of time between hello packets being sent and received
Dead Interval | How long to wait for a hello packet before marking the neighbour as dead. By default the dead interval is 4x the hello interval; essentially, the router can miss four hello intervals before marking the neighbour as down
Area ID | Both neighbours must be in the same OSPF Area
Subnet Mask | This is for connectivity; both neighbours need to be in the same subnet
Stub Area Flag | This is for when the neighbour has been defined as being in a Stub Area. Within OSPF, all routers in areas that have been defined as Stub Areas mark their hello messages with the Stub Flag
Authentication | Secures communication between neighbours. This can be configured as None, Clear Text or MD5
OSPF Router ID | A unique 32-bit ID number that’s set in dotted-decimal format
Maximum Transmission Unit (MTU) | As OSPF doesn’t support packet fragmentation, the MTU must be the same on both sides (the interface MTU itself is carried in Database Description packets rather than in the Hello). From my experience this is only changed if you are using jumbo frames
Router Priority | Used to determine the Designated and Backup Designated Routers
Designated Router & Backup Designated Router | The IP addresses of the Designated and Backup Designated Routers
Active Neighbours | List of all the neighbours the router has received a Hello Message from within the dead interval
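The matching rules can be sketched as a small comparison function. This is a simplified model under my own assumptions: the HelloParams class and adjacency_problems function are hypothetical names, authentication is reduced to a single string, and the MTU check is left out because it happens later in the exchange.

```python
from dataclasses import dataclass


@dataclass
class HelloParams:
    router_id: str
    hello_interval: int
    dead_interval: int
    area_id: str
    subnet_mask: str
    stub_flag: bool
    authentication: str


def adjacency_problems(local: HelloParams, remote: HelloParams) -> list:
    """Return the reasons two routers would fail to form an adjacency."""
    problems = []
    must_match = ("hello_interval", "dead_interval", "area_id",
                  "subnet_mask", "stub_flag", "authentication")
    for name in must_match:
        if getattr(local, name) != getattr(remote, name):
            problems.append(f"{name} mismatch")
    # The Router ID is the odd one out: it has to differ, not match.
    if local.router_id == remote.router_id:
        problems.append("duplicate router_id")
    return problems


r1 = HelloParams("1.1.1.1", 10, 40, "0.0.0.0", "255.255.255.0", False, "none")
r2 = HelloParams("2.2.2.2", 30, 120, "0.0.0.0", "255.255.255.0", False, "none")
print(adjacency_problems(r1, r2))
```

Here the two routers agree on everything except their timers, which is the sort of mismatch you would see between an interface left on broadcast defaults and one set to non-broadcast.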

OSPF uses its AllSPFRouters address to send hello messages out of all OSPF-enabled interfaces. It is important to add that if an interface has been set as a passive OSPF interface, its subnet will still be advertised into the OSPF routing domain, however hello messages ARE NOT sent out of it. From my experience this is commonly used on loopback addresses or external/customer-facing interfaces, as you would want to advertise the subnet into OSPF but wouldn’t want to start an OSPF neighbour relationship with your ISPs or customers.

The OSPF Router-ID is an important attribute when it comes to identifying a router within the OSPF domain. Each OSPF router has a Router-ID that is associated with the OSPF process, so it is possible to have two different processes active on a single router with two different Router-IDs. The OSPF Router-ID has to be configured in 32-bit dotted-decimal format; this is the case whether you are using OSPFv2 (IPv4) or OSPFv3 (IPv4 and IPv6). The Router-ID is discussed in RFC2328.

As each router gets an ID number, it is important to note that these IDs have to be unique: no two routers in the same OSPF domain can have the same Router-ID. If two neighbouring routers were to have the same Router-ID, they wouldn’t be able to create a neighbour relationship. Additionally, other routers peered with both of them will have issues, as OSPF updates arrive from the same Router-ID but with different link-state information; this can cause an OSPF flood war.

OSPF Router-ID Selection Process

The process of selecting the Router-ID within OSPF follows this order:

  1. Hard Coding the Router-ID: If the Router-ID is manually configured under the OSPF process, this takes precedence over everything. This is recommended and best practice
  2. Highest Logical IP Address: This will be the highest loopback address configured on the router
  3. Highest Active Physical IP address: This will be the highest IP address configured on a physical interface on the router

If you don’t hard code the Router-ID, you will always need to remember this when making IP address changes on the router: if you configure a new loopback or interface IP address that is higher than the current OSPF Router-ID, the Router-ID will change the next time the process is cleared or the device is reloaded, which can cause OSPF re-convergence.
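The selection order above can be sketched in a few lines of Python. The select_router_id function and its arguments are hypothetical; the one detail worth noticing is that addresses are compared numerically rather than as strings.

```python
import ipaddress


def select_router_id(configured=None, loopbacks=(), physical=()):
    """Pick a Router-ID: configured value, else highest loopback, else highest physical."""
    if configured:
        return configured
    for candidates in (loopbacks, physical):
        if candidates:
            # Compare as IPv4 addresses so that 10.0.0.10 beats 10.0.0.9.
            return str(max(ipaddress.IPv4Address(address) for address in candidates))
    raise ValueError("no IPv4 address available for a Router-ID")


print(select_router_id(loopbacks=["10.0.0.9", "10.0.0.10"], physical=["192.168.1.1"]))
```

Even though 192.168.1.1 is the numerically highest address on the box, the loopback wins because loopbacks are checked before physical interfaces.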

OSPF Neighbour Adjacency Process

Unlike other IGPs, OSPF has 2 neighbour adjacency states:

OSPF Neighbours: OSPF Neighbours are two routers/devices that have stopped at the 2-Way neighbour state. At this state the neighbours have bidirectional connectivity and all the OSPF parameters match. But it is important to note that the neighbours DO NOT exchange their link-state databases at this state.

OSPF Fully Adjacent Neighbours: OSPF Fully Adjacent Neighbours are two routers that have the same bidirectional connectivity and matching OSPF parameters; however, with Fully Adjacent Neighbours, each router exchanges its full link-state database with its neighbour and advertises the relationship in link-state update packets.

Within OSPF there are 8 neighbour states that two neighbours can go through to become Fully Adjacent Neighbours. These states are:

State | Description
Down | This is the start state of neighbour communications. No Hello Messages have been exchanged
Attempt | This state is valid only for Non-Broadcast Multi-Access (NBMA) networks. No hello packet has been received from the neighbour, and the local router sends unicast hello packets to that neighbour at the specified hello interval
Init | The router has received a Hello Message from a neighbour, but has not seen its own Router-ID in the neighbour’s hello. This means that bidirectional communication has not been established yet
2-Way | Bidirectional communication between the neighbours has been established, but no link-state information has been exchanged. At this state an OSPF neighbourship has been created
ExStart | This is where the neighbours start the process of becoming Fully Adjacent OSPF Neighbours, ready to exchange Link-State Databases
Exchange | At this state, Link-State Database details are being sent to the adjacent neighbour. A router in this state is capable of exchanging all OSPF routing protocol packets
Loading | At this state, the neighbour has described its own LSDB, but has not yet fully requested/received the LSAs it needs from its neighbour
Full | Both LSDBs have been exchanged and are fully synchronized. Each neighbour now has the full OSPF network topology available
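One way to keep the states straight is to treat them as an ordered list and map each one to the relationship it represents. The sketch below does just that; the NeighbourState enum and the relationship labels are my own shorthand for these notes, not terminology from the RFC.

```python
from enum import IntEnum


class NeighbourState(IntEnum):
    DOWN = 0
    ATTEMPT = 1  # NBMA networks only
    INIT = 2
    TWO_WAY = 3
    EXSTART = 4
    EXCHANGE = 5
    LOADING = 6
    FULL = 7


def relationship(state: NeighbourState) -> str:
    """Describe what kind of OSPF relationship a given state represents."""
    if state < NeighbourState.TWO_WAY:
        return "no relationship yet"
    if state == NeighbourState.TWO_WAY:
        return "neighbours"
    if state < NeighbourState.FULL:
        return "adjacency forming"
    return "fully adjacent"


for state in NeighbourState:
    print(f"{state.name:<8} {relationship(state)}")
```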

Designated Router & Backup Designated Router

OSPF has the concept of Designated and Backup Designated Routers (DR and BDR) for multi-access networks that use technologies such as Ethernet and Frame Relay, as on such a LAN you can have more than two OSPF-enabled routers. Having a DR and BDR helps an OSPF segment scale, in addition to reducing OSPF LSA flooding across the network. This is because the other routers on the LAN (OSPF DROthers) only create a Full OSPF Adjacency with the DR and BDR rather than with other DROthers. The DR is solely responsible for flooding the LAN with LSA updates during a topology change. The flooding by the DR is controlled, as stated above, by the AllSPFRouters and AllDRouters multicast addresses: the DR floods LSAs to the AllSPFRouters destination address to communicate with the other routers on the LAN, and DROthers send their LSAs to the DR and BDR using the AllDRouters destination address.

As the name suggests, the BDR’s role is to be the secondary router: if the DR were to fail or become uncontactable, the BDR takes over as the DR and another BDR is elected. Just like the DROthers, the BDR has a full OSPF adjacency with the DR; however, unlike them, the BDR also listens on the AllDRouters address. This means that, if the DR fails, the BDR can take over as DR more quickly and there will be less re-convergence across the network, as it is already synchronized with the DR and the DROthers, all of which hold the same LSDB.

Designated Router Election Method

The DR/BDR election process takes place during the 2-Way state, once bidirectional communication has been established between the routers and Hello Messages have been received. OSPF uses the interface priority and Router-ID to determine which routers will be elected as DR and BDR. An OSPF router can have its interface priority set between 0 and 255 (an interface priority of 0 means it is prohibited from entering the DR/BDR election process), with the highest priority taking the role of DR and the second highest priority becoming the BDR. If the priorities are all the same, the highest Router-ID is used as the tiebreaker.

By default, OSPF’s priority is 1 on Cisco IOS/XR and 128 on Juniper. With Cisco IOS XR, you are able to set the priority for all interfaces within an area globally as well as under the interface, whereas with Junos and Cisco IOS you can only set the priority under the interface.

IMPORTANT NOTE
If an OSPF router receives a Hello Packet in which the DR or BDR field isn’t 0.0.0.0, it will assume that the DR and BDR have already been elected and will become a DROther.
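The election rules can be sketched as a simple ranking. This is a deliberately simplified model that assumes a fresh election where no DR or BDR exists yet; the real procedure in RFC 2328 also takes into account what each router declares in its hellos, and as the note above says, an existing DR/BDR is not displaced by a newcomer. The elect_dr_bdr name and the sample priorities are made up.

```python
import ipaddress


def elect_dr_bdr(routers):
    """routers maps Router-ID -> interface priority. Returns (dr, bdr)."""
    # Priority 0 means the router never takes part in the election.
    eligible = [router_id for router_id, priority in routers.items() if priority > 0]
    ranked = sorted(
        eligible,
        key=lambda router_id: (routers[router_id], int(ipaddress.IPv4Address(router_id))),
        reverse=True,  # highest priority first, highest Router-ID breaks ties
    )
    dr = ranked[0] if ranked else None
    bdr = ranked[1] if len(ranked) > 1 else None
    return dr, bdr


segment = {"1.1.1.1": 1, "2.2.2.2": 1, "3.3.3.3": 0, "4.4.4.4": 100}
print(elect_dr_bdr(segment))
```

In this example 4.4.4.4 wins on priority, 2.2.2.2 beats 1.1.1.1 on the Router-ID tiebreaker, and 3.3.3.3 is never considered.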

Network Types

The Layer-2 topology within a network can have an effect on the behaviour of OSPF. A topology that uses Ethernet commonly allows multiple nodes on a LAN; as Ethernet supports broadcast domains, a Designated Router (DR) and Backup Designated Router (BDR) are used to cut down the OSPF LSA flooding. Other media, such as serial links, don’t support broadcast domains and only connect two routers, meaning a DR/BDR is not needed; non-broadcast multi-access media such as Frame Relay are covered separately below.

With this in mind OSPF has 5 different network types:

Broadcast

A Broadcast network is where an OSPF router is able to send a single message (a broadcast message) that reaches multiple other OSPF routers on the same multi-access segment. I.e. Routers A, B and C are connected to a switch; when Router A sends out a Hello Message, it will be broadcast across the segment via the switch. With this in mind, a DR/BDR is required to control the LSA flooding across the segment. By default OSPF uses broadcast as the network type when configured on an Ethernet LAN. The hello/dead timers are 10/40 seconds by default.

Non-Broadcast Multi-Access (NBMA)

This network type is used on links that do not support a broadcast domain, i.e. media such as Frame Relay, ATM and X.25, or topologies like hub and spoke where a router can connect to multiple nodes out of a single interface but isn’t fully meshed. A Non-Broadcast network still needs a DR/BDR, as you could have multiple nodes on the segment. However, a Non-Broadcast network (as the name would suggest) doesn’t support broadcast or multicast, which means that OSPF’s normal way of sending hellos via the multicast address 224.0.0.5 to flood the LAN looking for neighbours will not work. Instead it sends out unicast hello messages to statically configured neighbours. The hello/dead timers are 30/120 seconds by default.

Point-to-Point

This network type is commonly used when you only have two devices on the segment, i.e. if you have Router A connected to Router B using a /31 or /30, that will be regarded as a Point-to-Point (P2P) network. This network type doesn’t require a DR/BDR, as the two devices only have each other to communicate with and electing a DR/BDR would be a waste of router resources. In addition, it is important to note that P2P OSPF adjacencies form quicker, as the DR election is skipped and there is no wait timer. The hello/dead timers are 10/40 seconds by default and it supports OSPF multicast Hello Messages.

Point-to-Multipoint

This network type is commonly used in a partial mesh or hub-and-spoke network, where the Layer-2 topology doesn’t logically match the Layer-3 topology. I.e. in a hub-and-spoke or Frame Relay network, Router A is connected to Routers B and C, all on the same subnet. Layer-3 will assume Routers B and C are directly connected on the same LAN, whereas Layer-2 dictates that Router B can only communicate with Router C by going via Router A. With Point-to-Multipoint, each neighbour is advertised as a /32 endpoint, forcing the Layer-3 routing to match the Layer-2 topology through longest prefix match. The hello/dead timers are 30/120 seconds by default; it doesn’t require a DR/BDR and it supports OSPF multicast Hello Messages.

Loopback

This network type is enabled by default on all loopback interfaces and can only be configured on loopback interfaces. OSPF will always advertise loopback addresses as a /32 route, even if the interface has been configured with a different prefix length. Hello messages, timers and DR/BDR are not associated with the Loopback network type.
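For quick revision, the defaults described in this section can be collected into one lookup table. The values below are simply the ones stated in the notes above (the Loopback type is left out as it has no timers or DR/BDR); the NETWORK_TYPES name and its keys are my own.

```python
# Defaults per network type, as described in the notes above.
NETWORK_TYPES = {
    "broadcast":           {"hello": 10, "dead": 40,  "dr_bdr": True,  "multicast_hello": True},
    "non-broadcast":       {"hello": 30, "dead": 120, "dr_bdr": True,  "multicast_hello": False},
    "point-to-point":      {"hello": 10, "dead": 40,  "dr_bdr": False, "multicast_hello": True},
    "point-to-multipoint": {"hello": 30, "dead": 120, "dr_bdr": False, "multicast_hello": True},
}

# Types that cannot discover neighbours via multicast need them configured statically.
needs_static_neighbours = [
    name for name, defaults in NETWORK_TYPES.items() if not defaults["multicast_hello"]
]

for name, defaults in NETWORK_TYPES.items():
    print(f"{name:<20} hello/dead {defaults['hello']}/{defaults['dead']}  DR/BDR: {defaults['dr_bdr']}")
print("Static neighbours needed on:", needs_static_neighbours)
```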

Scaling OSPF

Areas

The wider a network gets, the wider the OSPF domain will become. This can be an issue, as all of these routers need to maintain the same LSDB, and with a larger network more resources are used processing LSA flooding and running the SPF algorithm, which in turn makes the router run inefficiently and possibly start dropping packets. A way of easing this issue is to introduce OSPF Areas. OSPF Areas are used to reduce the number of routers in a single area, which in turn shrinks the LSDB size, restricts LSA flooding within/between areas, allows route summarization between areas and speeds up SPF calculations. This is because routers maintain their own LSDB on a per-area basis. Essentially, areas hide their own topology, and any LSA flooding or SPF calculations stay local to that area whilst the rest of the network stays unaware. Routers within the same area have the same synchronized LSDB, while routers with interfaces in multiple areas hold one LSDB per area.

Router Types

Along with Area Types, OSPF has 4 different router roles, and dependent on the topology a router can hold multiple roles at once. The table below describes the different router types, and you can see where each of these router types could sit within a simple topology here.

Router Type | Function
Backbone Router | A router that is located in and/or has link(s) within Area 0 is known as a Backbone Router. If all of its links are within Area 0, it is also an Internal Router.
Internal Router | An Internal Router is an OSPF router that only has links within a single area. If this router is within Area 0, it will also be known as a Backbone Router.
Area Border Router (ABR) | An Area Border Router (ABR) is a router that has links in two or more areas. An ABR’s role is to inject routes from non-backbone areas into the Backbone. For a router to be an ABR, it HAS to have a link to Area 0; if it doesn’t, then it won’t be an ABR. It is considered a member of all areas it is connected to. An ABR keeps multiple copies of the link-state database in memory, one for each area to which that router is connected.
Autonomous System Boundary Router (ASBR) | An OSPF router that learns routes from external routing protocols (BGP, IS-IS, EIGRP, OSPF), static routes or both, and injects them into OSPF via redistribution. ASBRs are a special type of router: you can have an ASBR that isn’t an ABR, as the ASBR functions are independent of the ABR functions, but dependent on the topology you could have a router that is both an ASBR and an ABR.
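Since one router can hold several of these roles at once, it can help to see the rules written out as a function. This is only a sketch based on the definitions in the table above; the router_roles name and its arguments are hypothetical, and areas are represented as plain integers.

```python
def router_roles(interface_areas, redistributes=False):
    """Work out OSPF router roles from the areas a router has links in."""
    areas = set(interface_areas)
    roles = set()
    if 0 in areas:
        roles.add("Backbone")
    if len(areas) == 1:
        roles.add("Internal")
    if len(areas) > 1 and 0 in areas:
        roles.add("ABR")  # must have a link into Area 0 to count as an ABR
    if redistributes:
        roles.add("ASBR")  # independent of the ABR function
    return roles


print(router_roles([0, 0]))                   # all links in Area 0
print(router_roles([0, 1]))                   # links in Area 0 and Area 1
print(router_roles([2], redistributes=True))  # Area 2 only, redistributing routes
```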

OSPF Route Types

OSPF has a unique relationship between how routes are exchanged between areas and how these routes are ranked in importance. There are 3 types of routes exchanged within OSPF: Intra-Area, Inter-Area and External Routes. External Routes are further split into 2 different types:

Intra-Area Routes: these are routes that are learnt from Routers that are within the same area. They are also known as internal routes
Inter-Area Routes: these are routes that have been learnt from different areas. These routes have been injected via an ABR. They are also known as summary routes.
External Routes: these are routes that are learnt from outside of the OSPF domain. These routes have been learnt via redistribution by an ASBR. External routes have 2 classifications, Type 1 and Type 2.

  1. Type 1 Routes: for Type 1 routes, the metric value equals the Redistribution Metric + Total Path Metric. This means that the metric value will increase the further the route goes into the network from the injecting ASBR. Type 1 routes are also known as E1 and N1 External Routes
  2. Type 2 Routes: for Type 2 routes, the metric value is only the Redistribution Metric. This means that the metric value will stay the same, no matter how far the route goes into the network (within 30 hops) from the injecting ASBR. By default, Type 2 is the metric type used by OSPF. Type 2 routes are also known as E2 and N2 External Routes
NOTE
The order of preference for these route types is as follows:

  1. Intra-Area
  2. Inter-Area
  3. External Type 1
  4. External Type 2
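The difference between the two external metric types, and the preference order in the note above, can be sketched like this. The function names and sample metrics are made up for illustration, and the tie-breaking is simplified: route type is compared first and the metric only decides between routes of the same type.

```python
PREFERENCE = ["intra-area", "inter-area", "external-1", "external-2"]


def external_metric(metric_type, redistribution_metric, path_metric_to_asbr):
    """Type 1 adds the internal path cost; Type 2 keeps the redistribution metric only."""
    if metric_type == 1:
        return redistribution_metric + path_metric_to_asbr
    return redistribution_metric


def best_route(routes):
    """routes is a list of (route_type, metric); route type is compared before metric."""
    return min(routes, key=lambda route: (PREFERENCE.index(route[0]), route[1]))


e1 = external_metric(1, redistribution_metric=20, path_metric_to_asbr=30)
e2 = external_metric(2, redistribution_metric=20, path_metric_to_asbr=30)
print("E1 metric:", e1, "E2 metric:", e2)
print(best_route([("external-2", e2), ("inter-area", 500), ("external-1", e1)]))
```

Note that the inter-area route wins even with a much higher metric, because the route type is considered before the metric is ever looked at.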

Link-State Advertisement Types

Devices in an OSPF domain use LSAs to build their local area’s LSDB. These LSDBs are identical for devices in the same area. Different areas and different router types can produce different types of LSAs. There are 11 types of LSAs; however, there are typically 6 LSAs that are commonly used and that should be known. These are:

Type 1 – Router

Every OSPF router advertises a Type 1 Router LSA; these LSAs are essentially the building blocks of the LSDB. Type 1 LSAs are entries that describe the interfaces and neighbours of each and every OSPF router within the same area. In addition, these LSAs ARE NOT forwarded outside their own area, making the intra-area topology invisible to other areas.

Type 2 – Network

A Type 2 Network LSA is used on a broadcast OSPF segment with a DR. Network LSAs are always advertised by the DR and are used to identify all the routers (BDR and DROthers) across the multi-access segment. As with Type 1 LSAs, Network LSAs ARE NOT advertised outside of their own area, making the intra-area topology invisible to other areas.

Type 3 – Summary

Summary LSAs are the prefixes that are learnt from Type 1 and 2 LSAs and advertised by an ABR into other areas. ABRs DO NOT forward Type 1 and 2 LSAs to other areas; when Network and/or Router LSAs are received by an ABR, their prefixes are converted into Type 3 LSAs. If an ABR receives a Type 3 LSA from a Backbone router, it will regenerate a new Type 3 LSA, list itself as the advertising router and forward the new Summary LSA into the non-backbone area. This is how inter-area routes are processed by an ABR.

Type 5 – External

An External Type 5 LSA is flooded throughout the OSPF domain when a route from another routing protocol is redistributed via an ASBR. These LSAs are not associated with any area and are flooded unchanged to all areas, with the exception of Stub and Not-So-Stubby Areas.

Type 4 – Autonomous System Boundary Router (ASBR) Summary

When a Type 5 LSA is flooded to all areas, the next-hop information may not be available in other areas, because the route(s) would have been redistributed from another routing protocol. To solve this, the ABR floods the Router ID of the originating ASBR in a Type 4 ASBR Summary LSA. For Type 4 LSAs, the link-state ID is the Router ID of the described ASBR. Essentially, for any route that is redistributed into OSPF, when the first ABR receives the Type 5 LSA, it will generate and flood a Type 4 LSA.

Type 7 – Not So Stubby Area (NSSA) External

Routers in a Not-So-Stubby Area (NSSA) do not receive external LSAs from Area Border Routers, but are allowed to send external redistributed routes to other areas. Inside the NSSA these routes are carried as Type 7 LSAs, and ABRs DO NOT advertise Type 7 LSAs outside of their local area. Instead, the ABR will convert the Type 7 LSA into a Type 5 LSA and flood the Type 5 LSA across the OSPF domain as normal.
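As a revision aid, the six common LSA types can be summarised in one small lookup structure. Everything in it is condensed from the descriptions above; the COMMON_LSAS name, the wording of each entry and the originated_by helper are my own.

```python
# LSA type -> (name, originated by, where it is flooded), condensed from the notes above.
COMMON_LSAS = {
    1: ("Router", "every router", "own area only"),
    2: ("Network", "DR", "own area only"),
    3: ("Summary", "ABR", "into other areas"),
    4: ("ASBR Summary", "ABR", "into other areas"),
    5: ("External", "ASBR", "whole domain except Stub and NSSA areas"),
    7: ("NSSA External", "ASBR inside an NSSA", "NSSA only, converted to Type 5 by the ABR"),
}


def originated_by(originator):
    """Return the LSA types generated by a given kind of router."""
    return sorted(
        lsa_type for lsa_type, (_name, source, _scope) in COMMON_LSAS.items()
        if source == originator
    )


for lsa_type, (name, source, scope) in sorted(COMMON_LSAS.items()):
    print(f"Type {lsa_type}: {name:<14} from {source:<20} flooded: {scope}")
print("LSAs generated by an ABR:", originated_by("ABR"))
```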

In addition to the LSA types above, the other 5 LSA types that are within OSPF are:

  • Type 6 – Multicast Extension LSA
  • Type 8 – OSPFv2 External Attributes LSA, OSPFv3 Link-Local Only LSA
  • Type 9 – OSPFv2 Opaque LSA, OSPFv3 Intra-Area Prefix LSA
  • Type 10 – Opaque LSA
  • Type 11 – Autonomous System Opaque LSA

Types 9 – 11 are defined in RFC5250 and RFC2370. They are typically used for the MPLS Traffic Engineering extensions to OSPF. I personally haven’t looked into them as of yet, but will update this once I have done more reading.

Area Types

OSPF defines several special area types:

Backbone

As described earlier, the Backbone Area is also known as Area 0. This is the most important area in OSPF and there always has to be a Backbone Area. The Backbone Area MUST connect to all areas, as non-backbone areas have to use Area 0 as a transit area to communicate with other non-backbone areas. This is because the Backbone has all the routing information injected into it and advertises it back out. This design is important to prevent routing loops.

Stub Area

A Stub Area DOES NOT allow External Routes to be advertised within the area. This means that when the ABR of a Stub Area receives Type 5 (External) and Type 4 (ASBR Summary) LSAs, it does not flood them into the area; instead, the ABR generates a default route for the area as a Type 3 Summary LSA.

Not So Stubby Area (NSSA)

A Not So Stubby Area is similar to a Stub Area in that it DOES NOT allow Type 5 External LSAs; however, unlike Stub Areas, Not So Stubby Areas DO allow external routes to be redistributed into the area via an ASBR. As described above, when a route is redistributed into the NSSA, a Type 7 NSSA External LSA is flooded throughout the area, and once an ABR receives the Type 7 LSA, it is converted into a Type 5 LSA and flooded into other areas. It is important to add that, by default, the NSSA does not automatically advertise a default route when Type 5 or Type 7 LSAs are blocked by an ABR.

Totally Stubby Area (TSA)

A Totally Stubby Area DOES NOT allow any Inter-Area or External Routes to be advertised within the area. Essentially, if a Type 3 Summary or Type 5 External LSA is received by the ABR, it is not forwarded; instead, the ABR generates a default route and injects it into the area. Totally Stubby Areas only allow Intra-Area and Default Routes within the area. The only way for traffic to get routed outside of the area is via the default route, which is the only Type 3 LSA advertised into the area.

Totally Not So Stubby Area (TNSSA)

A Totally Not So Stubby Area DOES NOT permit Type 3 Summary, Type 4 ASBR Summary and Type 5 External LSAs to be received into the area. However, just like an NSSA, it allows external routes to be redistributed into the area via an ASBR. Just like an NSSA, when a route is redistributed into the TNSSA, a Type 7 NSSA External LSA is flooded throughout the area, and once an ABR receives the Type 7 LSA, it is converted into a Type 5 LSA and flooded into other areas. Unlike an NSSA, when a TNSSA ABR receives a Type 3 LSA from the backbone, it will automatically generate a default route and inject it into the area.
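The area types boil down to which LSA types are allowed inside the area and whether the ABR injects a default route on its own. The sketch below captures that as described in these notes; the AREA_TYPES structure and the lsa_allowed helper are hypothetical, and the default route that stub-type ABRs generate (itself a Type 3 LSA) is tracked as a separate flag rather than as an allowed type.

```python
# Area type -> LSA types carried inside the area, plus whether the ABR
# generates a default route automatically (as described in the notes above).
AREA_TYPES = {
    "backbone":       {"lsas": {1, 2, 3, 4, 5}, "default_route": False},
    "stub":           {"lsas": {1, 2, 3},       "default_route": True},
    "totally stubby": {"lsas": {1, 2},          "default_route": True},
    "nssa":           {"lsas": {1, 2, 3, 7},    "default_route": False},
    "totally nssa":   {"lsas": {1, 2, 7},       "default_route": True},
}


def lsa_allowed(area_type, lsa_type):
    """True if LSAs of this type are carried inside the given area type."""
    return lsa_type in AREA_TYPES[area_type]["lsas"]


for name, details in AREA_TYPES.items():
    print(f"{name:<15} LSAs {sorted(details['lsas'])}  auto default: {details['default_route']}")
```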
