Layer-2 VPNs on Junos

It has been a busy few weeks trying to stay ahead of all the new work coming towards me and the team, due to the insourcing of the core network! Luckily for my team, we have finally got our hands on full end-to-end connectivity! Fun times 😀

With that being said, I’ve been given a wee project to provision a circuit for a business customer between two sites for a Proof of Concept. As this circuit is being used as a POC (for now), it was agreed that a Layer 2 VPN (L2VPN/pseudowire) would be best suited, because a simple point-to-point connection was needed between two PEs. As we have an MPLS-enabled network, it was decided that this would be the easiest way to get their POC up and running quickly, as we were under a bit of a hard deadline!

For me, it was a good little project: even though I knew what L2VPNs were and how they work, I had never configured one myself. You see where I’m going with this now?

This post will note how to configure L2VPNs with Junos 😀

An L2VPN, also known as a pseudowire, is described in RFC4665, where it is called a Virtual Private Wire Service (VPWS):

The PE devices provide a logical interconnect such that a pair of CE devices appears to be connected by a single logical Layer 2 circuit. PE devices act as Layer 2 circuit switches. Layer 2 circuits are then mapped onto tunnels in the SP network. These tunnels can either be specific to a particular VPWS, or be shared among several services. VPWS applies for all services, including Ethernet, ATM, Frame Relay, etc. Each PE device is responsible for allocating customer Layer 2 frames to the appropriate VPWS and for proper forwarding to the intended destinations.

In essence, L2VPNs are virtual point-to-point circuits that use the underlying transport labels (LDP/RSVP) or a statically defined MPLS path to go between two PEs, which allows a layer 2 broadcast domain to be extended between them. If you need multiple sites in the same layer 2 broadcast domain, you will need to consider Virtual Private LAN Service (VPLS) or Ethernet VPN (EVPN).

Within Junos there are 3 ways of configuring L2VPNs: two are regarded as the modern way and have been documented in RFCs, with an additional legacy method. Kompella and Martini are regarded as the industry standard, with Circuit Cross-Connect (CCC) seen as legacy:

  • Circuit Cross-Connect: The Circuit Cross-Connect style of L2VPN uses a single outer label, also known as the tunnel/transport label, to transport the L2 payload from PE to PE. CCC can ONLY use RSVP as MPLS transport. In addition, each CCC connection has its own dedicated RSVP-signalled LSP, so the transport label cannot be shared between multiple connections. LSPs are manually created on each PE to determine which circuit the frame belongs to on the other end.
  • Martini: The Martini style of L2VPN has a pair of labels before the L2 frame. The outer label is the transport mechanism that carries the frame from the egress interface of the sending PE to the ingress interface of the receiving PE. The inner label, known as the VC label, informs the receiving PE where the L2VPN payload should go. It is important to note that with the Martini style, although either LDP or RSVP can be used as MPLS transport, LDP is always used for signalling the VC label. So if RSVP is used as the MPLS transport, LDP will need to be enabled on the loopback interface of both PE routers. A minimum of 2 LSPs will need to be set, as MPLS LSPs are unidirectional.
  • Kompella: The Kompella style of L2VPN is similar to the Martini style, as both use stacked labels before the Layer 2 payload and both can use LDP, RSVP or both for the transport label. The difference is that, unlike Martini, Kompella uses BGP to signal the VC label. This means you will need to have a BGP-enabled network. In addition, it’s not compulsory to set up static LSPs, as BGP provides a mechanism for autodiscovery of new point-to-point links, similar to VPLS. Although Kompella has a more complex configuration, because of its usage of BGP signalling it is regarded as the best option for large-scale deployments, as it works in conjunction with other BGP address families. RFC6624 has more details on L2VPN using BGP for Auto-Discovery and Signaling.

In our network, we use the Kompella style of L2VPNs, so the bulk of my testing, and the most in-depth, was with that method… Although I was able to get a wee bit of naughty time afterwards to configure the other methods 🙂

The topology I’ll be working with is a simple one. I’ve got a single MX480 broken up into 3 Logical Systems.

L2VPN Topology


The underlying IGP is IS-IS, with RSVP, LDP and BGP enabled. This is a mirror of what we have in production. With all the L2VPNs, the customer-facing physical interface has to be set to the correct encapsulation. For my testing, as I won’t be using VLANs, bridging or setting up a VPLS, I used ethernet-ccc and set the logical interface to family ccc. You can find out more about the different physical encapsulations here.

Interface Config
set interfaces xe-0/1/0 enable
set interfaces xe-0/1/0 encapsulation ethernet-ccc
set interfaces xe-0/1/0 unit 0 family ccc

RSVP
set protocols rsvp interface xe-1/0/0.0
set protocols rsvp interface xe-1/0/2.0

MPLS
set protocols mpls explicit-null
set protocols mpls ipv6-tunneling
set protocols mpls no-decrement-ttl
set protocols mpls interface xe-1/0/0.0
set protocols mpls interface xe-1/0/2.0

BGP
set protocols bgp group Master type internal
set protocols bgp group Master local-address 192.168.2.1
set protocols bgp group Master family inet unicast
set protocols bgp group Master family inet6 unicast
set protocols bgp group Master local-as 100
set protocols bgp group Master neighbor 192.168.2.2
set protocols bgp group Master neighbor 192.168.2.3

IS-IS
set protocols isis reference-bandwidth 1000g
set protocols isis level 1 disable
set protocols isis level 2 wide-metrics-only
set protocols isis interface xe-1/0/0.0 ldp-synchronization
set protocols isis interface xe-1/0/0.0 point-to-point
set protocols isis interface xe-1/0/0.0 link-protection
set protocols isis interface xe-1/0/2.0 ldp-synchronization
set protocols isis interface xe-1/0/2.0 point-to-point
set protocols isis interface xe-1/0/2.0 link-protection
set protocols isis interface xe-1/0/3.0 ldp-synchronization
set protocols isis interface xe-1/0/3.0 point-to-point
set protocols isis interface xe-1/0/3.0 link-protection
set protocols isis interface lo0.0

LDP
set protocols ldp track-igp-metric
set protocols ldp explicit-null
set protocols ldp transport-address router-id
set protocols ldp interface xe-1/0/0.0
set protocols ldp interface xe-1/0/2.0
set protocols ldp interface lo0.0
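
Before layering any L2VPN on top, it is probably worth a quick sanity check that the underlay is healthy. The commands below are a rough checklist rather than anything I captured from the lab: they are standard Junos operational commands, and the order is simply my assumption of a sensible bottom-up check (IGP, then label distribution, then BGP).

```
# Operational mode; run on each PE (suggested checklist, not lab output)
show isis adjacency
show ldp neighbor
show ldp session
show rsvp interface
show bgp summary
# inet.3 should hold a route to the remote PE loopback before any pseudowire can come up
show route table inet.3
```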

All configurations will be done on Master and SiteA, and for my examples I will show the work done on the Master instance. With all that out of the way… Let’s get cracking 😀

Kompella

As stated before, BGP is used as the VPN signalling method. With that in mind, we will need to enable layer 2 signalling within MP-BGP. This is simply done by adding the command family l2vpn signaling within the BGP stanza. This can be added globally within BGP, under a group (as I have done here) or under a specific neighbour.

set protocols bgp group Master family l2vpn signaling
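
If you would rather enable the family for a single peer instead of the whole group, a sketch of the per-neighbour form is below. It reuses the group name and peer address from this lab. As far as I understand Junos inheritance, a family statement at neighbour level replaces the group-level families for that peer rather than adding to them, so the existing inet and inet6 families are restated here; treat that as an assumption to verify on your own release.

```
# Per-neighbour alternative (sketch); families restated because neighbour-level
# family config is assumed to override the group-level list
set protocols bgp group Master neighbor 192.168.2.2 family inet unicast
set protocols bgp group Master neighbor 192.168.2.2 family inet6 unicast
set protocols bgp group Master neighbor 192.168.2.2 family l2vpn signaling
```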

With the signalling sorted, we can go straight into the configuration of the L2VPN. Just like L3VPNs, L2VPN configuration is done within the routing-instances stanza and uses the same parameters as an L3VPN: a Route Distinguisher (RD) and a Route Target/vrf-target (RT). The RD has to be unique per device, with the RT matching on all devices within the L2VPN. This is important so that traffic can be routed accordingly per site. In addition, the instance-type has to be set to l2vpn and the interface(s) have to be defined within the routing-instance as well.

set routing-instances Master instance-type l2vpn
set routing-instances Master interface xe-0/1/0.0
set routing-instances Master route-distinguisher 100:0001
set routing-instances Master vrf-target target:100:0000

Next, the properties for that site within the L2VPN will need to be configured under protocols l2vpn within the routing-instance. The encapsulation type has to match on all sites that want to participate in the VPN. The site identifier must be unique to each site within the L2VPN, as the site ID is used to compute label values for site-to-site communications. The interface(s) have to be defined within both the l2vpn and l2vpn site stanzas.

set routing-instances Master protocols l2vpn encapsulation-type ethernet
set routing-instances Master protocols l2vpn interface xe-0/1/0.0
set routing-instances Master protocols l2vpn site Master site-identifier 1
set routing-instances Master protocols l2vpn site Master interface xe-0/1/0.0

Full Kompella Configuration
set routing-instances Master instance-type l2vpn
set routing-instances Master interface xe-0/1/0.0
set routing-instances Master route-distinguisher 100:0001
set routing-instances Master vrf-target target:100:0000
set routing-instances Master protocols l2vpn encapsulation-type ethernet
set routing-instances Master protocols l2vpn interface xe-0/1/0.0
set routing-instances Master protocols l2vpn site Master site-identifier 1
set routing-instances Master protocols l2vpn site Master interface xe-0/1/0.0
set protocols bgp group Master family l2vpn signaling
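
For completeness, here is a sketch of what the matching configuration on the other PE (SiteA) could look like. The RD of 100:2 and site ID of 2 are inferred from the 100:2:2:1 route and connection-site 2 that appear in the verification output; the instance, site and BGP group name SiteA and the interface xe-0/1/1.0 are placeholders I have made up for illustration, so substitute your own.

```
# Remote PE (SiteA) - names and interface are assumed, not taken from the lab
set routing-instances SiteA instance-type l2vpn
set routing-instances SiteA interface xe-0/1/1.0
# RD must differ from the Master side (100:0001)
set routing-instances SiteA route-distinguisher 100:0002
# RT must match the Master side
set routing-instances SiteA vrf-target target:100:0000
set routing-instances SiteA protocols l2vpn encapsulation-type ethernet
set routing-instances SiteA protocols l2vpn interface xe-0/1/1.0
# Site ID must be unique within the L2VPN (Master uses 1)
set routing-instances SiteA protocols l2vpn site SiteA site-identifier 2
set routing-instances SiteA protocols l2vpn site SiteA interface xe-0/1/1.0
set protocols bgp group SiteA family l2vpn signaling
```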

Verification

The primary command used to check the status of a pseudowire is show l2vpn connections. As Kompella signalling uses BGP, we can run show bgp summary and see a route being received from the remote PE in the bgp.l2vpn.0 and routing-instance tables, which can be inspected with show route table bgp.l2vpn.0 and show route table Master.l2vpn.0 respectively. Additionally, we can check the mpls.0 table to confirm that the L2VPN incoming label and interface(s) for the pseudowire have made it into the routing table, by using show route table mpls.0.

Outputs: show l2vpn connections, show bgp summary, show route table Master.l2vpn.0, show route table mpls.0
[email protected]> show l2vpn connections    
Layer-2 VPN connections:

Legend for connection status (St)   
EI -- encapsulation invalid      NC -- interface encapsulation not CCC/TCC/VPLS
EM -- encapsulation mismatch     WE -- interface and instance encaps not same
VC-Dn -- Virtual circuit down    NP -- interface hardware not present 
CM -- control-word mismatch      -> -- only outbound connection is up
CN -- circuit not provisioned    <- -- only inbound connection is up
OR -- out of range               Up -- operational
OL -- no outgoing label          Dn -- down                      
LD -- local site signaled down   CF -- call admission control failure      
RD -- remote site signaled down  SC -- local and remote site ID collision
LN -- local site not designated  LM -- local site ID not minimum designated
RN -- remote site not designated RM -- remote site ID not minimum designated
XX -- unknown connection status  IL -- no incoming label
MM -- MTU mismatch               MI -- Mesh-Group ID not available
BK -- Backup connection	         ST -- Standby connection
PF -- Profile parse failure      PB -- Profile busy
RS -- remote site standby	 SN -- Static Neighbor
LB -- Local site not best-site   RB -- Remote site not best-site
VM -- VLAN ID mismatch

Legend for interface status 
Up -- operational           
Dn -- down

Instance: Master
  Local site: Master (1)
    connection-site           Type  St     Time last up          # Up trans
    2                         rmt   Up     Jun  4 12:36:46 2016           2
      Remote PE: 192.168.2.2, Negotiated control-word: Yes (Null)
      Incoming label: 800001, Outgoing label: 800000
      Local interface: xe-0/1/0.0, Status: Up, Encapsulation: ETHERNET

[email protected]> show bgp summary 
Groups: 1 Peers: 2 Down peers: 0
Table          Tot Paths  Act Paths Suppressed    History Damp State    Pending
inet.0               
                       0          0          0          0          0          0
inet6.0              
                       0          0          0          0          0          0
bgp.l2vpn.0          
                       1          1          0          0          0          0
Peer                     AS      InPkt     OutPkt    OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
192.168.2.2             100       3234       3229       0       1  1d 0:19:17 Establ
  inet.0: 0/0/0/0
  inet6.0: 0/0/0/0
  Master.l2vpn.0: 1/1/1/0
  bgp.l2vpn.0: 1/1/1/0
192.168.2.3             100       5735       5724       0       1 1d 19:06:59 Establ
  inet.0: 0/0/0/0
  inet6.0: 0/0/0/0
  Master.l2vpn.0: 0/0/0/0
  bgp.l2vpn.0: 0/0/0/0

[email protected]> show route table Master.l2vpn.0 

Master.l2vpn.0: 2 destinations, 2 routes (2 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

100:1:1:1/96                
                   *[L2VPN/170/-101] 1d 20:37:12, metric2 1
                      Indirect
100:2:2:1/96                
                   *[BGP/170] 00:01:02, localpref 100, from 192.168.2.2
                      AS path: I, validation-state: unverified
                    > to 192.168.1.14 via xe-1/0/0.0, Push 0
                      to 192.168.1.6 via xe-1/0/2.0, Push 300000

[email protected]> show route table mpls.0 protocol l2vpn    

mpls.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

800001             *[L2VPN/7] 23:30:09
                    > via xe-0/1/0.0, Pop       Offset: 4
xe-0/1/0.0         *[L2VPN/7] 00:06:20, metric2 100
                    > to 192.168.1.14 via xe-1/0/0.0, Push 800000 Offset: 252
                      to 192.168.1.6 via xe-1/0/2.0, Push 800000, Push 300000(top) Offset: 252

From the end host point of view, we have end-to-end connectivity 😀

[email protected]:~$ ping -c 2 -q 192.168.137.3
PING 192.168.137.3 (192.168.137.3) 56(84) bytes of data.

--- 192.168.137.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.431/0.637/0.843/0.206 ms

Note
The remote route shown by show route table Master.l2vpn.0 (100:2:2:1/96) starts with the Route Distinguisher of the other end of the pseudowire (100:2), followed by the remote site ID and the label offset.

Martini

Martini signalling uses LDP, as stated before, and with LDP already enabled, I will focus on the actual configuration, which is done within the protocols l2circuit stanza. Compared to Kompella, the configuration for the Martini style of L2VPN is much simpler. All that is needed is:

  • The remote neighbour to be defined. In my example I will be using the loopback address of SiteA as the remote neighbour
  • The customer-facing interface connecting into the VPN
  • A circuit ID, which must match on both sides

All this can be done in one line!

set protocols l2circuit neighbor 192.168.2.2 interface xe-0/1/0.0 virtual-circuit-id 1
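
The other end needs the mirror image of that line. A minimal sketch for SiteA is below: the neighbour is the Master loopback (192.168.2.1) and the circuit ID matches, but the interface name xe-0/1/1.0 is an assumption for illustration only.

```
# On the remote PE (SiteA); interface name is assumed
set protocols l2circuit neighbor 192.168.2.1 interface xe-0/1/1.0 virtual-circuit-id 1
```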

With that we have Martini style L2VPN configured 🙂

Verifications

To check the status of a Martini-style L2VPN, use show l2circuit connections; the output is near enough the same as show l2vpn connections. As Martini uses LDP for signalling, we can use show ldp neighbor to check that the neighbour relationship with the remote side has been established, and show ldp database to verify that the new labels associated with the pseudowire (L2CKT) have been installed into the database. Additionally, you can check the inet.3 and mpls.0 routing tables by using show route table inet.3 and show route table mpls.0.

Outputs: show l2circuit connections, show ldp neighbor, show ldp database, show route table inet.3, show route table mpls.0
[email protected]> show l2circuit connections 
Layer-2 Circuit Connections:

Legend for connection status (St)   
EI -- encapsulation invalid      NP -- interface h/w not present   
MM -- mtu mismatch               Dn -- down                       
EM -- encapsulation mismatch     VC-Dn -- Virtual circuit Down    
CM -- control-word mismatch      Up -- operational                
VM -- vlan id mismatch		 CF -- Call admission control failure
OL -- no outgoing label          IB -- TDM incompatible bitrate 
NC -- intf encaps not CCC/TCC    TM -- TDM misconfiguration 
BK -- Backup Connection          ST -- Standby Connection
CB -- rcvd cell-bundle size bad  SP -- Static Pseudowire
LD -- local site signaled down   RS -- remote site standby
RD -- remote site signaled down  HS -- Hot-standby Connection
XX -- unknown

Legend for interface status  
Up -- operational            
Dn -- down                   
Neighbor: 192.168.2.2 
    Interface                 Type  St     Time last up          # Up trans
    xe-0/1/0.0(vc 1)          rmt   Up     Jun  5 14:03:37 2016           1
      Remote PE: 192.168.2.2, Negotiated control-word: Yes (Null)
      Incoming label: 300000, Outgoing label: 300016
      Negotiated PW status TLV: No
      Local interface: xe-0/1/0.0, Status: Up, Encapsulation: ETHERNET
      Flow Label Transmit: No, Flow Label Receive: No

[email protected]> show ldp neighbor      
Address            Interface          Label space ID         Hold time
192.168.2.2        lo0.0              192.168.2.2:0            43
192.168.1.6        xe-1/0/2.0         192.168.2.3:0            14
192.168.1.14       xe-1/0/0.0         192.168.2.2:0            13

[email protected]> show ldp database                             
Input label database, 192.168.2.1:0--192.168.2.2:0
  Label     Prefix
 299984      192.168.2.1/32
      0      192.168.2.2/32
 300000      192.168.2.3/32
 300016      L2CKT CtrlWord ETHERNET VC 1

Output label database, 192.168.2.1:0--192.168.2.2:0
  Label     Prefix
      0      192.168.2.1/32
 299968      192.168.2.2/32
 299984      192.168.2.3/32
 300000      L2CKT CtrlWord ETHERNET VC 1

Input label database, 192.168.2.1:0--192.168.2.3:0
  Label     Prefix
 300016      192.168.2.1/32
 300000      192.168.2.2/32
      0      192.168.2.3/32

Output label database, 192.168.2.1:0--192.168.2.3:0
  Label     Prefix
      0      192.168.2.1/32
 299968      192.168.2.2/32
 299984      192.168.2.3/32

[email protected]> show route table inet.3 192.168.2.2 

inet.3: 3 destinations, 4 routes (3 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

192.168.2.2/32     *[LDP/9] 1d 21:16:06, metric 100
                    > to 192.168.1.14 via xe-1/0/0.0, Push 0
                      to 192.168.1.6 via xe-1/0/2.0, Push 300000
                    [RSVP/10/1] 1d 01:14:11, metric 100
                    > to 192.168.1.6 via xe-1/0/2.0, label-switched-path to-siteA

[email protected]> show route table mpls.0 protocol l2circuit 

mpls.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

300000             *[L2CKT/7] 00:05:13
                    > via xe-0/1/0.0, Pop       Offset: 4
xe-0/1/0.0         *[L2CKT/7] 00:05:13, metric2 100
                    > to 192.168.1.14 via xe-1/0/0.0, Push 300016 Offset: 252
                      to 192.168.1.6 via xe-1/0/2.0, Push 300016, Push 300000(top) Offset: 252

From the end host point of view, connectivity between the two is there 🙂

[email protected]:~$ ping -c 2 -q 192.168.137.3
PING 192.168.137.3 (192.168.137.3) 56(84) bytes of data.

--- 192.168.137.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.358/0.532/0.707/0.176 ms

Circuit Cross-Connect

As CCC doesn’t support stacked labels, unlike Kompella and Martini, we will need to configure 2 manually defined LSPs between the PE routers: CCC needs one LSP to transmit traffic and another to receive it. So firstly, we will need to get the LSPs configured. The receive LSP is configured on the remote PE (where it is that router's transmit LSP), so locally we only define the transmit LSP, under the protocols mpls label-switched-path stanza. I've used the loopback address of the remote end as the destination, with the underlying IGP working out the best path.

set protocols mpls label-switched-path to-siteA to 192.168.2.2
set protocols mpls label-switched-path to-siteA no-cspf

With the LSPs configured, we will need to go under the protocols connections stanza. We need to define the customer-facing interface(s) that will be connecting into the VPN, then set the transmit LSP and the receive LSP; the latter is the name of the LSP configured on the remote end.

set protocols connections remote-interface-switch siteA interface xe-0/1/0.0
set protocols connections remote-interface-switch siteA transmit-lsp to-siteA
set protocols connections remote-interface-switch siteA receive-lsp to-Master
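
Since each side's receive LSP is the other side's transmit LSP, the remote PE needs the reverse of the above. A sketch for SiteA follows; the LSP names come from the configuration and output in this post, while the connection name Master and the interface xe-0/1/1.0 are assumptions for illustration.

```
# On the remote PE (SiteA); connection name and interface are assumed
set protocols mpls label-switched-path to-Master to 192.168.2.1
set protocols mpls label-switched-path to-Master no-cspf
set protocols connections remote-interface-switch Master interface xe-0/1/1.0
# Transmit and receive are swapped relative to the Master side
set protocols connections remote-interface-switch Master transmit-lsp to-Master
set protocols connections remote-interface-switch Master receive-lsp to-siteA
```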

With that we are sorted!

Verifications

With regard to CCC, there are fewer show commands, from what I’ve found (let me know if there are more, please), but we can check the pseudowire's status by using show connections. We can confirm the transmit (ingress) and receive (egress) LSPs using show mpls lsp and, finally, we can check the mpls.0 table to confirm that the incoming label and interface(s) for the pseudowire have made it into the routing table, by using show route table mpls.0.

Outputs: show connections, show mpls lsp, show route table mpls.0
[email protected]> show connections 
CCC and TCC connections [Link Monitoring On]
Legend for status (St):             Legend for connection types:
 UN -- uninitialized                 if-sw:  interface switching
 NP -- not present                   rmt-if: remote interface switching
 WE -- wrong encapsulation           lsp-sw: LSP switching
 DS -- disabled                      tx-p2mp-sw: transmit P2MP switching
 Dn -- down                          rx-p2mp-sw: receive P2MP switching
 -> -- only outbound conn is up     Legend for circuit types:
 <- -- only inbound  conn is up      intf -- interface
 Up -- operational                   oif  -- outgoing interface
 RmtDn -- remote CCC down            tlsp -- transmit LSP
 Restart -- restarting               rlsp -- receive LSP


Connection/Circuit                Type        St      Time last up     # Up trans
siteA                             rmt-if      Up      Jun  3 12:42:55           1
  xe-0/1/0.0                        intf  Up
  to-siteA                          tlsp  Up
  to-Master                         rlsp  Up

[email protected]> show mpls lsp                           
Ingress LSP: 1 sessions
To              From            State Rt P     ActivePath       LSPname
192.168.2.2     192.168.2.1     Up     0 *     to-siteA         to-siteA
Total 1 displayed, Up 1, Down 0

Egress LSP: 1 sessions
To              From            State   Rt Style Labelin Labelout LSPname 
192.168.2.1     192.168.2.2     Up       0  1 FF  300080        - to-Master
Total 1 displayed, Up 1, Down 0

Transit LSP: 0 sessions
Total 0 displayed, Up 0, Down 0

[email protected]> show route table mpls.0 protocol ccc    

mpls.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

300080             *[CCC/7] 00:00:04
                    > via xe-0/1/0.0, Pop      
xe-0/1/0.0         *[CCC/10/1] 00:00:04, metric 100
                    > to 192.168.1.14 via xe-1/0/0.0, label-switched-path to-siteA

Finally to confirm end-to-end reachability between the end hosts

[email protected]:~$ ping -c 2 -q 192.168.137.3
PING 192.168.137.3 (192.168.137.3) 56(84) bytes of data.

--- 192.168.137.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.358/0.532/0.707/0.176 ms

I had planned to have a wee bit more in this post, with what I was actually testing; however, this is getting a bit longer than I expected, so I'll make this into a two-parter 😉

My next post will detail how you can use traffic engineering to manipulate an L2VPN path between 2 PE routers! Hope to see you there 😀

References

Darren's Blog L2VPN in Junos
RFC4665
MPLS l2VPN
RFC6624
RFC6074
Vlan based CCC L2vpn

IPv6 and Junos – VRRPv3

With the Christmas season coming up, changes and current project work slow down, so this is the perfect time to start messing around in the lab with things I wouldn’t normally get the chance to do. One thing I know will be happening in the future (but not the near future) on my work network is IPv6. We run IPv6 in the Internet Core; however, it’s not to be seen anywhere else on the network, and I’ve heard they’re pushing hard to get IPv6 into our hosting datacentres. With this in mind, and having time to kill, it would be good to be proactive and start looking at how IPv6 and Junos work together!

From looking at the hosting and enterprise (a small bit of enterprise) network, I had a chat with a few of the seniors and we came up with a list of things that were most likely to be used on the network, and that we agreed would be the best things to test:

Routing and Switching Features

  • VRRPv3
  • BGP
  • ACL
  • Virtual Routers (VRFs)
  • IGPs (OSPFv3, Static Routes & IS-IS)
  • SLAAC (Router Advertisements)
  • DHCPv6
  • Multicasting

Firewall Features

  • NAT64 / DNS64
  • Security Policies

Of course this list isn’t the be-all and end-all; however, for now it’s a good base to get me started, and from there we’ll see what happens next. Where I can, I’ll mostly be using IPv6 only, but there are a few features where I’ll have a dual-stacked setup, as it will be good ‘real world’ testing! So with all that talk and explanation out of the way… Let’s get cracking 😀

The first protocol on my list: Virtual Router Redundancy Protocol (VRRP). I’ve previously written a post on how to configure VRRP between a Cisco and a Juniper switch; that post defines what VRRP is and why you would use it within your network. When working with IPv6 (or in a dual-stacked environment), on the other hand, you will need to make sure that you are using VRRPv3. VRRPv3 supports both IPv4 and IPv6 and is defined in RFC5798, which sums up its main advantages:

The VRRP router controlling the IPv4 or IPv6 address(es) associated with a virtual router is called the Master, and it forwards packets sent to these IPv4 or IPv6 addresses. VRRP Master routers are configured with virtual IPv4 or IPv6 addresses, and VRRP Backup routers infer the address family of the virtual addresses being carried based on the transport protocol. Within a VRRP router, the virtual routers in each of the IPv4 and IPv6 address families are a domain unto themselves and do not overlap. The election process provides dynamic failover in the forwarding responsibility should the Master become unavailable. For IPv4, the advantage gained from using VRRP is a higher-availability default path without requiring configuration of dynamic routing or router discovery protocols on every end-host. For IPv6, the advantage gained from using VRRP for IPv6 is a quicker switchover to Backup routers than can be obtained with standard IPv6 Neighbor Discovery mechanisms.

For this test, I’ll have a similar topology to my other VRRP post, but I’ll be using 2x Juniper EX4200 switches and an ESXi Ubuntu 14.04 LTS host configured with an active-backup bond; the 2 physical NICs were connected to one switch each.

VRRPv3 Topology

VRRP Configuration

Firstly, we will need to enable VRRPv3. By default VRRPv3 isn’t enabled and VRRPv2 doesn’t support inet6, so it has to be turned on, which is done under the protocols vrrp stanza. In addition, as IPv6 doesn’t use Address Resolution Protocol (ARP) for link-layer address resolution, we need to enable router advertisements, which are part of the IPv6 equivalent of ARP, Neighbor Discovery Protocol (NDP). This will allow Neighbor Discovery (ND) messages to be sent out to hosts and other network devices within that subnet, which are needed for VRRPv3 to work effectively.

NOTE
For more about IPv6 NDP, check out RFC4861

ND is set under the protocols router-advertisement stanza, with the logical interface defined.

{master:0}[edit protocols]
[email protected]# show 
router-advertisement {
    interface vlan.100 {
        prefix 2001:192:168:1::/64;
    }
}
vrrp {
    version-3;
}
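
For anyone who prefers pasting set commands, the same two stanzas can be written in set format. This is just a flattened form of the configuration shown above, using the same interface and prefix.

```
# Router advertisements on the VLAN interface, advertising the /64 used in this lab
set protocols router-advertisement interface vlan.100 prefix 2001:192:168:1::/64
# Switch the VRRP daemon to version 3 so inet6 groups are supported
set protocols vrrp version-3
```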

Just like with VRRPv2, you will need to set the entire configuration under the interfaces stanza, whether you are using a VLAN interface or a physical interface. It is very important to note that you will need to manually set a link-local address on the interface and set a virtual link-local address (both will need to be in the same subnet); without these you will not be able to commit the configuration.

VRRP Master
{master:0}[edit interfaces vlan unit 100]
[email protected]# show 
family inet6 {
    address 2001:192:168:1::2/64 {
        vrrp-inet6-group 1 {
            virtual-inet6-address 2001:192:168:1::1;
            virtual-link-local-address fe80:192:168:1::1;
            priority 200;
            preempt;
            accept-data;
        }
    }
    address fe80:192:168:1::2/64;
}

VRRP Backup
{master:0}[edit interfaces vlan]
[email protected]# show 
unit 100 {
    family inet6 {
        address 2001:192:168:1::3/64 {
            vrrp-inet6-group 1 {
                virtual-inet6-address 2001:192:168:1::1;
                virtual-link-local-address fe80:192:168:1::1;
                priority 100;
                no-preempt;
                accept-data;
            }
        }
        address fe80:192:168:1::3/64;
    }
}
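
As a convenience, here is the VRRP Master interface configuration flattened into set format. It is derived line for line from the Master stanza shown above; for the Backup you would swap in the ::3 addresses, priority 100 and no-preempt.

```
# VRRP Master (set-format equivalent of the stanza above)
set interfaces vlan unit 100 family inet6 address 2001:192:168:1::2/64 vrrp-inet6-group 1 virtual-inet6-address 2001:192:168:1::1
set interfaces vlan unit 100 family inet6 address 2001:192:168:1::2/64 vrrp-inet6-group 1 virtual-link-local-address fe80:192:168:1::1
set interfaces vlan unit 100 family inet6 address 2001:192:168:1::2/64 vrrp-inet6-group 1 priority 200
set interfaces vlan unit 100 family inet6 address 2001:192:168:1::2/64 vrrp-inet6-group 1 preempt
set interfaces vlan unit 100 family inet6 address 2001:192:168:1::2/64 vrrp-inet6-group 1 accept-data
# Manually assigned link-local address, required alongside the virtual link-local address
set interfaces vlan unit 100 family inet6 address fe80:192:168:1::2/64
```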

VRRP Verification

Depending on the level of detail you want to go into, you can run any of these commands: show vrrp summary, show vrrp detail or show vrrp extensive. I checked both the Master and Backup to make sure everything was as expected and to see the differences between the two, by using show vrrp detail.

VRRP Master
[email protected]> show vrrp detail    
Physical interface: vlan, Unit: 100, Address: 2001:192:168:1::2/64
  Index: 72, SNMP ifIndex: 709, VRRP-Traps: disabled, VRRP-Version: 3
  Interface state: up, Group: 1, State: master, VRRP Mode: Active
  Priority: 200, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: yes, Accept-data mode: yes, VIP count: 2, VIP: fe80:192:168:1::1, 2001:192:168:1::1
  Advertisement Timer: 0.530s, Master router: fe80:192:168:1::2
  Virtual router uptime: 00:00:20, Master router uptime: 00:00:17
  Virtual Mac: 00:00:5e:00:02:01 
  Tracking: disabled

VRRP Backup
[email protected]> show vrrp detail 
Physical interface: vlan, Unit: 100, Address: 2001:192:168:1::3/64
  Index: 72, SNMP ifIndex: 709, VRRP-Traps: disabled, VRRP-Version: 3
  Interface state: up, Group: 1, State: backup, VRRP Mode: Active
  Priority: 100, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: no, Accept-data mode: yes, VIP count: 2, VIP: fe80:192:168:1::1, 2001:192:168:1::1
  Dead timer: 3.244s, Master priority: 200, Master router: fe80:192:168:1::2 
  Virtual router uptime: 00:00:28
  Tracking: disabled 

In addition, we can confirm from the VRRP Master that we are receiving ND messages from the ESXi host, as we can see its entries when we run the command show ipv6 neighbors

[email protected]> show ipv6 neighbors 
IPv6 Address                 Linklayer Address  State       Exp Rtr Secure Interface
2001:192:168:1::3            cc:e1:7f:2b:82:81  stale       776 yes no      vlan.100    
2001:192:168:1::4            00:0c:29:d3:ac:77  stale       1070 no no      vlan.100   
fe80::20c:29ff:fed3:ac77     00:0c:29:d3:ac:77  stale       673 no  no      vlan.100    
fe80::20c:29ff:fed3:ac81     00:0c:29:d3:ac:81  stale       588 no  no      vlan.100    
fe80:192:168:1::3            cc:e1:7f:2b:82:81  stale       776 yes no      vlan.100

Failover Testing

Before testing the VRRP failover, I enabled VRRP traceoptions on the master and backup, so that we would be able to see what’s happening under the bonnet. I found the logs from the backup were much simpler to understand compared to the master; however, on the master you are able to see the process the VRRP daemon goes through when gaining mastership.

{master:0}[edit protocols vrrp]
[email protected]# show 
traceoptions {
    file vrrp.backup.log;
    flag all;
}
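
In set format, the same traceoptions look like the sketch below, followed by the operational commands I would expect to use to read the resulting file. The match string is only an example taken from the log message names seen in this post.

```
# Configuration mode: same traceoptions as the stanza above
set protocols vrrp traceoptions file vrrp.backup.log
set protocols vrrp traceoptions flag all

# Operational mode: read the trace file, optionally filtering for mastership changes
show log vrrp.backup.log
show log vrrp.backup.log | match VRRPD_NEW_MASTER
```

Bear in mind that flag all is quite chatty, so it is probably best removed once the testing is done.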

For the failover, the link down to the host and the trunk link on the master were deactivated. From the logs on the VRRP Backup, we can see that the VRRP daemon received a vrrpd_process_ppmd_packet notification that the VRRP master adjacency had gone down. It then received another update, ppmd_vrrp_delete_adj, to remove the link-local address of the VRRP master, and transitioned to become the VRRP Master.

Apr  2 14:43:03 vrrpd_process_ppmd_packet : PPMP_PACKET_ADJ_DOWN received
Apr  2 14:43:03 vrrpd_update_state_machine, vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 state: backup
Apr  2 14:43:03 vrrp_fsm_update IFD: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 event: transition
Apr  2 14:43:03 vrrp_fsm_transition: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 state from: backup
Apr  2 14:43:03 vrrp_fsm_update_for_inherit IFD: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 event: transition
Apr  2 14:43:03 ppmd_vrrp_delete_adj : VRRP neighbour fe80:192:168:1::2 on interface <72 1 1> deleted
Apr  2 14:43:03 vrrp_fsm_update IFD: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 event: master
Apr  2 14:43:03 vrrp_fsm_active: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 state from: transition
Apr  2 14:43:03 VRRPD_NEW_MASTER: Interface vlan.100 (local address 2001:192:168:1::3) became VRRP master for group 1 with master reason masterNoResponse
Apr  2 14:43:03 vrrpd_construct_pdu if: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001, checksum flag 0, checksum 17650
Apr  2 14:43:03 vrrpd_ppmd_program_send : Creating XMIT on IFL 72, Group 1, Distributed 0, enabled 1
Apr  2 14:43:03 vrrp_fsm_update_for_inherit IFD: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 event: master

As preempt is configured on the Master, once the two interfaces were reactivated it automatically took over as VRRP Master again. As we can see from the logs on the original backup switch, another vrrpd_process_ppmd_packet notification was received by the switch and the switch automatically transitioned back to become VRRP Backup.

Apr  2 14:44:00 vrrpd_process_ppmd_packet : PPMP_PACKET_RECEIVE received
Apr  2 14:44:00 vrrp_fsm_update IFD: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 event: backup
Apr  2 14:44:00 vrrp_fsm_backup: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 state from: master
Apr  2 14:44:00 VRRPD_NEW_BACKUP: Interface vlan.100 (local address 2001:192:168:1::3) became VRRP backup for group 1
Apr  2 14:44:00 vrrpd_construct_pdu if: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001, checksum flag 0, checksum 17650
Apr  2 14:44:00 vrrpd_ppmd_program_send : Creating XMIT on IFL 72, Group 1, Distributed 0, enabled 0
Apr  2 14:44:00 vrrp_fsm_update_for_inherit IFD: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 event: backup
Apr  2 14:44:00 Signalled dcd (PID 1225) to reconfig
Apr  2 14:44:00 ppmd_vrrp_set_adj : Created adjacency for neighbor fe80:192:168:1::2 on interface <72 1 1> with hold-time <3 609000000>, Distributed 0
Apr  2 14:44:00 ppmd_vrrp_program_send : Programmed periodic send on interface <72 1 1> with enabled = 0, Distribute = 0, MASTER RE = 1
Apr  2 14:44:00 vrrpd_rts_async_ifa_msg, Received Async message for: (null) index: 72, family 0x1c op: 0x3 address : 2001:192:168:1::1
Apr  2 14:44:00 vrrpd_rts_async_ifa_msg, Received Async message for: (null) index: 72, family 0x1c op: 0x3 address : fe80:192:168:1::1
Apr  2 14:44:00 vrrpd_rts_async_ifl_msg, Received Async message for: vlan index: 72, flags 0xc000 op: 0x2
Apr  2 14:44:00 vrrpd_if_find_by_ifname_internal, Found vlan.000.100: vlan.000.100.001.2001:0192:0168:0001:0000:0000:0000:0003.001 in run 1
Apr  2 14:44:00 vrrpd_find_track_if_entry_array_by_name: vlan.100
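
If you want to know how long the backup actually held mastership, the two VRRPD_NEW_MASTER and VRRPD_NEW_BACKUP lines above are enough to work it out. Below is a rough Python sketch; the function names are made up for this example, and it assumes both events happen on the same day, so only the time of day is parsed.

```python
import re
from datetime import datetime

# Matches "Apr  2 14:43:03 VRRPD_NEW_MASTER: ..." style trace lines
EVENT_RE = re.compile(
    r"^\w{3}\s+\d+\s+(\d{2}:\d{2}:\d{2})\s+(VRRPD_NEW_MASTER|VRRPD_NEW_BACKUP):"
)


def mastership_events(lines):
    """Return (time, event) tuples for the mastership changes in a trace log."""
    events = []
    for line in lines:
        match = EVENT_RE.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%H:%M:%S")
            events.append((when, match.group(2)))
    return events


def seconds_as_master(lines):
    """Seconds between the first NEW_MASTER and the following NEW_BACKUP event."""
    became_master = None
    for when, event in mastership_events(lines):
        if event == "VRRPD_NEW_MASTER" and became_master is None:
            became_master = when
        elif event == "VRRPD_NEW_BACKUP" and became_master is not None:
            return (when - became_master).total_seconds()
    return None


log = [
    "Apr  2 14:43:03 VRRPD_NEW_MASTER: Interface vlan.100 (local address 2001:192:168:1::3) became VRRP master for group 1 with master reason masterNoResponse",
    "Apr  2 14:44:00 VRRPD_NEW_BACKUP: Interface vlan.100 (local address 2001:192:168:1::3) became VRRP backup for group 1",
]
print(seconds_as_master(log))
```

For the two lines above this works out at 57 seconds between the backup taking over and handing mastership back.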

In addition, you can check to see how many transitional changes were made by using the show vrrp extensive command:

[email protected]> show vrrp extensive | match Backup 
    Idle to backup transitions               :4         
    Backup to master transitions             :4         
    Master to backup transitions             :0

From the host’s point of view, I had a rolling ping going to the gateway during the failover testing and the results were as expected.

--- 2001:192:168:1::1 ping statistics ---
123 packets transmitted, 95 received, 22% packet loss, time 122275ms
rtt min/avg/max/mdev = 1.058/96.186/3257.648/494.372 ms, pipe 4

Although you see packet loss, this is normal given the bond type (active-backup) and the fact that one of the two NICs was unavailable. In addition, the connectivity never completely dropped out, and having a host running at 50% capacity is better than a host with no accessibility.
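
For anyone wanting to pull the numbers out of the ping summary rather than eyeball them, here is a small Python sketch. The packet_loss helper is hypothetical and only expects the "N packets transmitted, M received" wording shown above.

```python
import re

SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) received")


def packet_loss(summary: str):
    """Return (packets lost, loss percentage) from a ping summary line."""
    match = SUMMARY_RE.search(summary)
    if match is None:
        raise ValueError("not a ping summary line")
    sent, received = int(match.group(1)), int(match.group(2))
    lost = sent - received
    return lost, 100.0 * lost / sent


lost, percent = packet_loss(
    "123 packets transmitted, 95 received, 22% packet loss, time 122275ms"
)
print(f"{lost} packets lost ({percent:.2f}%)")
```

That gives 28 lost packets out of 123, which is just under 22.8%; the 22% in the ping output looks to be the same figure with the decimals dropped.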

The full logs and ping6 outputs are available here: vrrp.master.log, vrrp.backup.log, ping6.

VRRPv3 has a very similar configuration to VRRPv2, but it took me a while to work out that without the small differences, i.e. enabling router-advertisement and version-3, you could be looking at the screen scratching your head! And with that we’ve got another post in the books. Keep an eye out for future posts on IPv6 and Junos 🙂


VRRP Between Cisco and Juniper Switches

Reading Time: 3 minutes

For one of the many projects that I’ve been assigned at work, I got the chance to join the InfoSec Team and help design and configure the second site for their expanding network. Of course, any network engineer always wants to design and provision a network they can call their own! So we were put on a plane and off to Sunny Glasgow, with a plan of attack and 4 days to get this first phase done.

To say it was a busy few days would be the understatement of the year: long days and nights on the data floor stacking, racking, patching and configuring. We had a hard deadline to get everything configured and remotely accessible, so making sure the network was sorted was key! But one good thing was that the data floor was in one of our office buildings and it had a window! Inserts shameless Instagram plug!

 

 
For those who haven’t worked in a dedicated datacentre, you wouldn’t understand how great natural light and a view can be after 10 hours of work haha
In the end, phase one was completed on time (just), with everything working as expected. Inserts another shameless Instagram plug!

 


Missing from that post above was a Cisco 3750X that was used for vendor redundancy as part of the network. The guys had an HP c7000 Blade Chassis with 2 HP Virtual Connect chassis switches, which needed to be connected to the edge switches, a Juniper EX4300 and the Cisco. This meant that I would have to span a VLAN across two switches and share a default gateway between them. With this being the case, I had to use a First Hop Redundancy Protocol (FHRP) and, as I was using a multi-vendor topology, the FHRP of choice would have to be VRRP (Virtual Router Redundancy Protocol).

VRRP is best defined in RFC3768:

The Virtual Router Redundancy Protocol (VRRP) is designed to eliminate the single point of failure inherent in the static default routed environment. VRRP specifies an election protocol that dynamically assigns responsibility for a virtual router to one of the VRRP routers on a LAN. The VRRP router controlling the IP address(es) associated with a virtual router is called the Master, and forwards packets sent to these IP addresses. The election process provides dynamic fail-over in the forwarding responsibility should the Master become unavailable.

As VRRP is an open standard, it’s interoperable between Cisco and Juniper devices. If I had been using just Cisco devices, I would have had a choice between VRRP and HSRP (Hot Standby Router Protocol). HSRP works similarly to VRRP but it’s a Cisco proprietary protocol, which means it’s only compatible between Cisco devices. You can see more detail on HSRP in RFC2281.

Due to the upstream routing requirements and the EX4300 being the higher-specced switch, it was decided that the EX4300 was going to be the Master. The topology I was working with is shown below.

VRRP Topology
With that all explained, let’s get cracking 😀

Juniper Configuration

Physical Interface Configuration

xe-0/2/3 {
    description "TRUNK to Edge Cisco";
    enable;
    unit 0 {
        family ethernet-switching {
            interface-mode trunk;
            vlan {
                members reith;
            }
        }
    }
}

Integrated Routing & Bridging Configuration

irb {
    enable;
    unit 100 {
        enable;
        family inet {
            address 10.199.6.1/23 {
                vrrp-group 1 {
                    virtual-address 10.199.7.254;
                    priority 150;
                    no-preempt;
                    accept-data;
                }
            }
        }
    }
}

Vlan Configuration

vlans {
    reith {
        vlan-id 100;
        l3-interface irb.100;
    }
}

Note
With the irb configuration, under the vrrp-group stanza, I had to add the command accept-data. Adding this command enables the master router to accept all packets destined for the Virtual IP (VIP) address. If this isn’t enabled when the EX4300 is set as (or becomes) master, it will not respond to any packets sent to the VIP address!

Cisco Configuration

Physical Interface t1/1/2

egde-cisco#show run int t1/1/2 
Building configuration...

Current configuration : 137 bytes
!
interface TenGigabitEthernet1/1/2
 description "TRUNK to Edge Juniper"
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 100
 switchport mode trunk
end

Routed VLAN Interface

egde-cisco#show run int vlan100
Building configuration...

Current configuration : 176 bytes
!
interface Vlan100
 ip address 10.199.6.2 255.255.254.0
 vrrp 1 description "TRUNK to Edge Juniper"
 vrrp 1 ip 10.199.7.254
 no vrrp 1 preempt
 vrrp 1 priority 145
end
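
Before moving on to verification, it can be worth sanity-checking the two configs against each other. Here is a rough Python sketch that takes the addresses and priorities from the configs above and checks that both interfaces and the VIP sit in the same subnet, then works out which switch should win the election. The dictionaries and the expected_master helper are just for illustration, and the election logic assumes both switches come up together: with preempt disabled on both sides, a higher-priority switch won't take over from an existing master.

```python
import ipaddress

VIP = ipaddress.ip_address("10.199.7.254")

# Values copied from the Juniper irb.100 and Cisco Vlan100 configuration
routers = [
    {"name": "EX4300", "interface": "10.199.6.1/23", "priority": 150},
    {"name": "3750X", "interface": "10.199.6.2/255.255.254.0", "priority": 145},
]


def shared_network(routers):
    """Return the common subnet, or raise if the interfaces disagree."""
    networks = {ipaddress.ip_interface(r["interface"]).network for r in routers}
    if len(networks) != 1:
        raise ValueError(f"interfaces are in different subnets: {networks}")
    return networks.pop()


def expected_master(routers):
    """Highest priority wins; a tie is broken by the highest interface address."""
    return max(
        routers,
        key=lambda r: (r["priority"], ipaddress.ip_interface(r["interface"]).ip),
    )


network = shared_network(routers)
print(f"Subnet: {network}, VIP in subnet: {VIP in network}")
print(f"Expected master: {expected_master(routers)['name']}")
```

With the values above, both interfaces land in 10.199.6.0/23, the VIP is inside that subnet, and the EX4300 comes out as the expected master.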

Juniper Verification

Depending on the level of detail you want to go into, you can run any of these commands: show vrrp summary, show vrrp detail or show vrrp extensive. I mostly use show vrrp summary or show vrrp detail, as I’ve found that (most of the time) you get what you need from either, unless you’ve had a big issue and extensive detail is needed!

Show VRRP Summary

[email protected]> show vrrp summary     
Interface     State       Group   VR state       VR Mode    Type   Address 
irb.100       up              1   master          Active    lcl    10.199.6.1         
                                                            vip    10.199.7.254

Show VRRP Detail

[email protected]> show vrrp detail       
Physical interface: irb, Unit: 100, Address: 10.199.6.1/23
  Index: 547, SNMP ifIndex: 567, VRRP-Traps: disabled, VRRP-Version: 2
  Interface state: up, Group: 1, State: master, VRRP Mode: Active
  Priority: 150, Advertisement interval: 1, Authentication type: none
  Advertisement threshold: 3, Computed send rate: 0
  Preempt: no, Accept-data mode: yes, VIP count: 1, VIP: 10.199.7.254       
  Advertisement Timer: 0.064s, Master router: 10.199.6.1
  Virtual router uptime: 19:40:12, Master router uptime: 19:40:04
  Virtual Mac: 00:00:5e:00:01:01 
  Tracking: disabled
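
The Virtual Mac in the output isn’t random: VRRP uses a well-known MAC prefix with the group ID (VRID) as the last octet, 00:00:5e:00:01:xx for IPv4 and 00:00:5e:00:02:xx for IPv6. A quick Python sketch of that rule is below; the helper names are my own.

```python
def vrrp_virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """Return the VRRP virtual MAC for a group ID (VRID)."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    family = 0x02 if ipv6 else 0x01  # fifth octet: 01 for IPv4, 02 for IPv6
    return f"00:00:5e:00:{family:02x}:{vrid:02x}"


def to_cisco_format(mac: str) -> str:
    """Convert aa:bb:cc:dd:ee:ff into Cisco's aabb.ccdd.eeff notation."""
    digits = mac.replace(":", "")
    return ".".join(digits[i:i + 4] for i in (0, 4, 8))


print(vrrp_virtual_mac(1))                   # IPv4 group 1
print(vrrp_virtual_mac(1, ipv6=True))        # IPv6 group 1
print(to_cisco_format(vrrp_virtual_mac(1)))  # same MAC as Cisco displays it
```

For group 1 this gives 00:00:5e:00:01:01, which is what the EX4300 reports, and 0000.5e00.0101 in Cisco notation.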

Cisco Verification

On a Cisco, you can check VRRP status by running the command show vrrp

egde-cisco#show vrrp 
Vlan100 - Group 1  
"TRUNK to Edge Juniper"
  State is Backup  
  Virtual IP address is 10.199.7.254
  Virtual MAC address is 0000.5e00.0101
  Advertisement interval is 1.000 sec
  Preemption disabled
  Priority is 145 
  Master Router is 10.199.6.1, priority is 145 
  Master Advertisement interval is 1.000 sec
  Master Down interval is 3.433 sec
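
The Master Down interval of 3.433 sec looks odd at first, but it falls out of the VRRP timers: three advertisement intervals plus a skew time based on the local priority. A minimal Python sketch of that calculation, assuming the standard formula from the VRRP RFCs (the function name is mine):

```python
def master_down_interval(priority: int, advert_interval: float = 1.0) -> float:
    """Seconds a backup waits without advertisements before taking over."""
    if not 0 <= priority <= 255:
        raise ValueError("priority must be between 0 and 255")
    # Lower priorities get a longer skew, so the best backup times out first
    skew_time = (256 - priority) * advert_interval / 256
    return 3 * advert_interval + skew_time


print(master_down_interval(145))  # the Cisco backup in this post
print(master_down_interval(100))  # the backup from the VRRPv3 post
```

A priority of 145 gives roughly 3.4336 seconds, which lines up with the 3.433 sec shown by the Cisco, and a priority of 100 gives roughly 3.609 seconds, which matches the hold-time of <3 609000000> in the VRRPv3 backup logs from the previous post.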

And with that we are done! Confirmed VRRP is working as expected! To be honest, before getting started I was a little worried that I’d run into plenty of issues running cross-vendor, but it was pretty straightforward, which is always good when you’re under the gun 🙂


Configuring a Virtual Chassis on QFX5100

Reading Time: 2 minutes

When configuring a 2 member Virtual Chassis using 2x QFX5100, there are slight differences compared to EX Series, but it’s very similar. As this is the case, I thought I’d do a quick post (and it doubles up as documentation writing for work lol). The QFX doesn’t have dedicated VC ports or a VC module, so you’ll have to use either 10GbE SFP+ port(s) or 40GbE QSFP port(s) to connect the switches together. The method is the same as Configuring Virtual Chassis on EX switch using VCEP ports; the one difference is that with the QFX you can do the entire configuration with the VCEP ports pre-connected, which wasn’t the case with EX Series. However, as with the EX Series, it’s recommended that you have the Backup Routing Engine (RE) or Linecard powered off if you’re using the preprovisioned method like I am 🙂

Let’s get cracking 😀

In one of my previous posts, Configuring Virtual Chassis on Juniper EX Series, I recommended that you have the following commands set before proceeding with configuring a Virtual Chassis:

set system commit synchronize
set chassis redundancy graceful-switchover
set routing-options nonstop-routing
set protocols layer2-control nonstop-bridging
Note
On the QFX, the nonstop-bridging command is under the protocols layer2-control stanza, NOT the ethernet-switching-options stanza as on the EX Series.

Once these have been committed, you can configure the virtual-chassis stanza.

[email protected]> show configuration virtual-chassis | display set 
set virtual-chassis preprovisioned
set virtual-chassis no-split-detection
set virtual-chassis member 0 role routing-engine
set virtual-chassis member 0 serial-number TA3715110057
set virtual-chassis member 1 role routing-engine
set virtual-chassis member 1 serial-number TA3715110028
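
If you’re stacking more than one pair, it can be handy to generate these lines from a list of serial numbers instead of typing them out. A small Python sketch is below; the preprovisioned_vc_commands function is purely illustrative and only produces the set commands shown above.

```python
def preprovisioned_vc_commands(members, split_detection=False):
    """Build the 'set virtual-chassis' lines for a preprovisioned VC.

    members maps member ID -> (role, serial number).
    """
    commands = ["set virtual-chassis preprovisioned"]
    if not split_detection:
        commands.append("set virtual-chassis no-split-detection")
    for member_id, (role, serial) in sorted(members.items()):
        commands.append(f"set virtual-chassis member {member_id} role {role}")
        commands.append(
            f"set virtual-chassis member {member_id} serial-number {serial}"
        )
    return commands


members = {
    0: ("routing-engine", "TA3715110057"),
    1: ("routing-engine", "TA3715110028"),
}
print("\n".join(preprovisioned_vc_commands(members)))
```

Run against the two serial numbers from this post, it prints the same six lines as the show configuration output above.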

Just like on EX Series, you have to set the VC-Ports on the Master Routing Engine for them to know those ports are being used as the Virtual Chassis interconnects

[email protected]> request virtual-chassis vc-port set pic-slot 0 port 48 
[email protected]> request virtual-chassis vc-port set pic-slot 0 port 50

Having completed the entire configuration on the Master RE, you can now power up the Backup RE. Once the Backup has booted, you have to set the 40GbE QSFP ports as VC-Ports, just as was done on the Master

[email protected]> request virtual-chassis vc-port set pic-slot 0 port 48 
[email protected]> request virtual-chassis vc-port set pic-slot 0 port 50

Once this is done, the Virtual Chassis is created and you’ll be kicked out of the Backup RE and will have to log back into the switch, where you will be logged into the Master Routing Engine. To verify that everything is working as expected you can run the commands show virtual-chassis vc-port and show virtual-chassis

show virtual-chassis vc-port

[email protected]> show virtual-chassis vc-port    
fpc0:
--------------------------------------------------------------------------
Interface   Type              Trunk  Status       Speed        Neighbor
or                             ID                 (mbps)       ID  Interface
PIC / Port
0/48        Configured          5    Up           40000        1   vcp-255/0/48
0/50        Configured          5    Up           40000        1   vcp-255/0/50

fpc1:
--------------------------------------------------------------------------
Interface   Type              Trunk  Status       Speed        Neighbor
or                             ID                 (mbps)       ID  Interface
PIC / Port
0/48        Configured          5    Up           40000        0   vcp-255/0/48
0/50        Configured          5    Up           40000        0   vcp-255/0/50

show virtual-chassis

[email protected]> show virtual-chassis 

Preprovisioned Virtual Chassis
Virtual Chassis ID: 1165.24bd.5581
Virtual Chassis Mode: Enabled
                                                Mstr           Mixed Route Neighbor List
Member ID  Status   Serial No    Model          prio  Role      Mode  Mode ID  Interface
0 (FPC 0)  Prsnt    TA3715110057 qfx5100-48s-6q 129   Master*      N  VC   1  vcp-255/0/48
                                                                           1  vcp-255/0/50
1 (FPC 1)  Prsnt    TA3715110028 qfx5100-48s-6q 129   Backup       N  VC   0  vcp-255/0/48
                                                                           0  vcp-255/0/50

And you’re done! As I said, it’s very similar to configuring a Virtual Chassis on EX Series, except for a couple of small changes that could throw someone off if they didn’t know!

For more in-depth detail you can check Juniper’s TechLibrary page


Virtual Chassis Upgrade with Minimal Downtime

Reading Time: 6 minutes

At work we were looking to upgrade Junos from 12.3 to 13.2X, and we have a few Virtual Chassis (VC) switches. The plan was to use the NSSU method so that we didn’t get any downtime. However, during testing, I would kick off the NSSU and the backup member would upgrade, reboot and come up as expected, but the master would then just sit waiting for the backup to reboot:

{master:0}
[email protected]> ...p/jinstall-ex-4200-13.2X51-D35.3-domestic-signed.tgz    
Chassis ISSU Check Done
[Dec 18 04:32:13]:ISSU: Validating Image
[Dec 18 04:32:41]:ISSU: Preparing Backup RE
[Dec 18 04:32:42]: Installing image on other FPC's along with the backup

[Dec 18 04:32:42]: Checking pending install on fpc1
[Dec 18 04:33:41]: Pushing bundle to fpc1
NOTICE: Validating configuration against mchassis-install.tgz.
NOTICE: Use the 'no-validate' option to skip this if desired.
WARNING: A reboot is required to install the software
WARNING:     Use the 'request system reboot' command immediately
[Dec 18 04:34:42]: Completed install on fpc1
[Dec 18 04:34:53]: Backup upgrade done
[Dec 18 04:34:53]: Rebooting Backup RE

Rebooting fpc1
[Dec 18 04:34:54]:ISSU: Backup RE Prepare Done
[Dec 18 04:34:54]: Waiting for Backup RE reboot

After an hour of looking at this on the master, I consoled into the backup to see what had booted and was up, and I clearly had an issue. I aborted the NSSU and checked to see what was going on; the backup member had upgraded and had connected with the master:

{master:0}
[email protected]> show version 
fpc0:
--------------------------------------------------------------------------
Hostname: EX4200-A
Model: ex4200-48t
JUNOS Base OS boot [12.3R5.7]
JUNOS Base OS Software Suite [12.3R5.7]
JUNOS Kernel Software Suite [12.3R5.7]
JUNOS Crypto Software Suite [12.3R5.7]
JUNOS Online Documentation [12.3R5.7]
JUNOS Enterprise Software Suite [12.3R5.7]
JUNOS Packet Forwarding Engine Enterprise Software Suite [12.3R5.7]
JUNOS Routing Software Suite [12.3R5.7]
JUNOS Web Management [12.3R5.7]
JUNOS FIPS mode utilities [12.3R5.7]

fpc1:
--------------------------------------------------------------------------
Hostname: EX4200-A
Model: ex4200-48t
JUNOS EX  Software Suite [13.2X51-D35.3]
JUNOS FIPS mode utilities [13.2X51-D35.3]
JUNOS Online Documentation [13.2X51-D35.3]
JUNOS EX 4200 Software Suite [13.2X51-D35.3]
JUNOS Web Management [13.2X51-D35.3]
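
Spotting the mismatch by eye is easy with two members, but with a bigger stack it helps to pull the versions out per member. Here is a rough Python sketch that parses show version output like the above; versions_by_member is a made-up helper, and it only relies on the "fpcN:" headers and the "JUNOS ... [version]" lines seen in this post.

```python
import re


def versions_by_member(output: str):
    """Map each fpcN section of 'show version' to the set of versions it lists."""
    versions = {}
    member = None
    for raw_line in output.splitlines():
        line = raw_line.strip()
        header = re.fullmatch(r"(fpc\d+):", line)
        if header:
            member = header.group(1)
            versions[member] = set()
            continue
        found = re.search(r"^JUNOS .*\[(.+)\]$", line)
        if found and member is not None:
            versions[member].add(found.group(1))
    return versions


sample = """
fpc0:
Hostname: EX4200-A
JUNOS Base OS boot [12.3R5.7]
JUNOS Web Management [12.3R5.7]

fpc1:
Hostname: EX4200-A
JUNOS EX  Software Suite [13.2X51-D35.3]
JUNOS Web Management [13.2X51-D35.3]
"""

result = versions_by_member(sample)
all_versions = set().union(*result.values())
print(result)
print("Mismatch!" if len(all_versions) > 1 else "All members on the same code")
```

Fed a trimmed copy of the output above, it reports 12.3R5.7 on fpc0 and 13.2X51-D35.3 on fpc1, so the mismatch is flagged.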

I thought this was very odd, so I checked the logs to see if anything was out of the norm. I saw that the VCP ports had come up; however, the attempts to bring the backup member (FPC 1) online had timed out :/

show log messages output
[email protected]> show log messages | last 100    
Dec 18 04:44:33  EX4200-A /kernel: tcp_timer_rexmt: Dropping socket connection due to error: 65
Dec 18 04:44:36  EX4200-A last message repeated 4 times
Dec 18 05:01:30  EX4200-A chassism[1280]: cm_ff_ifd_disable: fast failover disabled for internal-0/26
Dec 18 05:01:30  EX4200-A chassism[1280]: cm_ff_ifd_disable: fast failover disabled for internal-0/27
Dec 18 05:01:30  EX4200-A vccpd[1282]: ifl vcp-0.32768 set up, ifl flags 0, flags 1
Dec 18 05:01:30  EX4200-A vccpd[1282]: interface vcp-0 came up
Dec 18 05:01:30  EX4200-A chassism[1280]: cm_ff_ifd_disable: fast failover disabled for internal-1/26
Dec 18 05:01:30  EX4200-A vccpd[1282]: ifl vcp-1.32768 set up, ifl flags 0, flags 1
Dec 18 05:01:30  EX4200-A vccpd[1282]: interface vcp-1 came up
Dec 18 05:01:30  EX4200-A chassism[1280]: cm_ff_ifd_disable: fast failover disabled for internal-1/27
Dec 18 05:01:30  EX4200-A vccpd[1282]: Member 0, interface vcp-1.32768 came up
Dec 18 05:01:30  EX4200-A vccpd[1282]: Member 0, interface vcp-0.32768 came up
Dec 18 05:01:30  EX4200-A vccpd[1282]: Member 1, interface vcp-1.32768 came up
Dec 18 05:01:30  EX4200-A vccpd[1282]: Member 1, interface vcp-0.32768 came up
Dec 18 05:01:36  EX4200-A chassism[1280]: cm_ff_vcp_port_add: fast failover received VCP port add on dev 0 port 26
Dec 18 05:01:36  EX4200-A chassism[1280]: cm_ff_vcp_port_add: fast failover received VCP port add on dev 0 port 27
Dec 18 05:01:36  EX4200-A chassism[1280]: cm_ff_vcp_port_add: fast failover received VCP port add on dev 1 port 26
Dec 18 05:01:36  EX4200-A chassism[1280]: cm_ff_vcp_port_add: fast failover received VCP port add on dev 1 port 27
Dec 18 05:01:36  EX4200-A chassism[1280]: CM_CHANGE: Member 0->0, Mode M->M, 0M 1B, GID 0, Master Unchanged, Members Changed
Dec 18 05:01:36  EX4200-A chassism[1280]: CM_CHANGE: 0M 1B
Dec 18 05:01:36  EX4200-A chassism[1280]: CM_CHANGE: Signaling license service
Dec 18 05:01:36  EX4200-A chassism[1280]: mvlan_member_change_add: member id 1 (my member id 0, my role 1)
Dec 18 05:01:36  EX4200-A chassism[1280]: mvlan_ifl_create: Creating ifl, name bme0, subunit 32770
Dec 18 05:01:36  EX4200-A chassism[1280]: mvlan_rts_ifl_op: IFL idx is 8 is created
Dec 18 05:01:39  EX4200-A chassisd[1298]: CHASSISD_VERSION_MISMATCH: Version mismatch:   chassisd message version 2   FPC 1 message version 2   local IPC version $Revision: 590540 $   remote IPC version $Revision: 653007 $
Dec 18 05:01:42  EX4200-A license-check[1331]: LICENSE: copy to /config/license from fpc0:/config/.license_priv/
Dec 18 05:01:42  EX4200-A license-check[1331]: LIBJNX_REPLICATE_RCP_ERROR: rcp -r -Ji fpc0:/config/.license_priv/ /config/license : rcp: /config/.license_priv/: No such file or directory
Dec 18 05:01:42  EX4200-A license-check[1331]: LIBJNX_REPLICATE_RCP_ERROR: rcp -r -Ji fpc1:/config/.license_priv/ /config/license : rcp: /config/.license_priv/: No such file or directory
Dec 18 05:01:42  EX4200-A license-check[1331]: copy from member 0 failed
Dec 18 05:01:42  EX4200-A license-check[1331]: LICENSE: copy to /config/license from fpc1:/config/.license_priv/
Dec 18 05:01:42  EX4200-A license-check[1331]: copy from member 1 failed
Dec 18 05:01:50  EX4200-A bdbrepd: Subscriber Management is ready for GRES
Dec 18 05:01:52  EX4200-A license-check[1331]: LICENSE: copy to /config/license from fpc0:/config/.license_priv/
Dec 18 05:01:52  EX4200-A license-check[1331]: LIBJNX_REPLICATE_RCP_ERROR: rcp -r -Ji fpc0:/config/.license_priv/ /config/license : rcp: /config/.license_priv/: No such file or directory
Dec 18 05:01:52  EX4200-A license-check[1331]: copy from member 0 failed
{...}
Dec 18 05:02:39  EX4200-A chassisd[1298]: CHASSISD_FRU_ONLINE_TIMEOUT: fpc_online_timeout: attempt to bring FPC 1 online timed out
Dec 18 05:03:39  EX4200-A chassisd[1298]: CHASSISD_FRU_ONLINE_TIMEOUT: fpc_online_timeout: attempt to bring FPC 1 online timed out
Dec 18 05:03:39  EX4200-A chassisd[1298]: CHASSISD_FRU_UNRESPONSIVE: Error for FPC 1: attempt to bring online timed out; restarted it
Dec 18 05:03:39  EX4200-A chassisd[1298]: CHASSISD_FRU_OFFLINE_NOTICE: Taking FPC 1 offline: Restarting unresponsive board
Dec 18 05:03:39  EX4200-A chassisd[1298]: CHASSISD_IFDEV_DETACH_FPC: ifdev_detach_fpc(1)
Dec 18 05:03:39  EX4200-A chassisd[1298]: CHASSISD_SNMP_TRAP7: SNMP trap generated: FRU removal (jnxFruContentsIndex 7, jnxFruL1Index 2, jnxFruL2Index 0, jnxFruL3Index 0, jnxFruName FPC: EX4200-48T, 8 POE @ 1/*/*, jnxFruType 3, jnxFruSlot 1)
Dec 18 05:03:40  EX4200-A chassisd[1298]: CHASSISD_VERSION_MISMATCH: Version mismatch:   chassisd message version 2   FPC 1 message version 2   local IPC version $Revision: 590540 $   remote IPC version $Revision: 653007 $
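
The line that gives the game away in all of that is CHASSISD_VERSION_MISMATCH, where the local and remote IPC revisions differ. If you’re trawling through a long messages file, a few lines of Python can pick those out. This is only a sketch, with a regex written against the exact message format shown above and a helper name of my own choosing.

```python
import re

MISMATCH_RE = re.compile(
    r"CHASSISD_VERSION_MISMATCH:.*?FPC (\d+) message version"
    r".*?local IPC version \$Revision: (\d+) \$"
    r"\s+remote IPC version \$Revision: (\d+) \$"
)


def ipc_mismatches(lines):
    """Return the FPC slot plus local/remote IPC revisions for each mismatch."""
    results = []
    for line in lines:
        match = MISMATCH_RE.search(line)
        if match:
            fpc, local_rev, remote_rev = (int(group) for group in match.groups())
            results.append({"fpc": fpc, "local": local_rev, "remote": remote_rev})
    return results


sample_log = [
    "Dec 18 05:01:36  EX4200-A chassism[1280]: CM_CHANGE: 0M 1B",
    "Dec 18 05:01:39  EX4200-A chassisd[1298]: CHASSISD_VERSION_MISMATCH: "
    "Version mismatch:   chassisd message version 2   FPC 1 message version 2   "
    "local IPC version $Revision: 590540 $   remote IPC version $Revision: 653007 $",
]

for entry in ipc_mismatches(sample_log):
    print(f"FPC {entry['fpc']}: local {entry['local']} vs remote {entry['remote']}")
```

For the log above it reports FPC 1 with local revision 590540 against remote revision 653007.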

It was Friday and I had a planned upgrade for the following week, so I didn’t have the time to raise a JTAC case (which I probably should have done, but that could come later). With this in mind, I thought I should be able to manually fail over the Routing Engines and upgrade each member the same way, without all of the magic of the NSSU.

NSSU Note
It took longer than expected to do this testing and I had to cancel my change. I found out that currently (as in when this was written) you can’t use NSSU to upgrade from 12.3 to any higher versions. This explained why everything was breaking and giving me issues. After raising this with our Technical Account Manager at Juniper, he provided details on What version Of Junos supports NSSU on EX Series.

Soooooooo this is what this post will be about, the success or failure of manually failing over a VC with minimal downtime 🙂

Let’s get cracking!

I was using 2x EX4200 with JUNOS 12.3R5.7; it’s the same setup I had in my previous Virtual Chassis post. I used the preprovisioned method of stacking the switches, and had the following VC specific configuration applied:

show routing-options

[email protected]# show routing-options 
nonstop-routing;
static {
    route 0.0.0.0/0 {
        next-hop 10.1.0.1;
        no-readvertise;
    }
}

show chassis

[email protected]# show chassis 
redundancy {
    graceful-switchover;
}

show virtual-chassis

[email protected]# show virtual-chassis 
preprovisioned;
no-split-detection;
member 0 {
    role routing-engine;
    serial-number BP0214340104;
}
member 1 {
    role routing-engine;
    serial-number BP0215090120;
}
fast-failover {
    ge;
    xe;
}

It’s important to make sure you have nonstop-routing, graceful-switchover and no-split-detection configured; without these you will most likely get a split-brain effect, and that’s not a good thing!

I’ve got a VM connected to both switches with an LACP bond configured

[email protected]> show lldp neighbors 
Local Interface    Parent Interface    Chassis Id          Port info          System Name
ge-0/0/2.0         ae1.0               00:0c:29:4f:26:bb   eth1               km-vm1              
ge-1/0/2.0         ae1.0               00:0c:29:4f:26:bb   eth2               km-vm1

and I have the VM pinging its default gateway (192.31.1.1), which is the l3-interface on the switch

[email protected]:~$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.31.1.1      0.0.0.0         UG    0      0        0 bond0
10.1.0.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.31.1.0      0.0.0.0         255.255.255.0   U     0      0        0 bond0

Now everything is sorted, let’s try some stuff!

As the VM is dual connected to both members, I’ll shut down the interfaces and the VCP ports of the backup switch, upgrade it and then do the same on the master switch. In essence, I’ll be breaking the VC to upgrade each switch individually. I’ll be running a continuous ping from the VM to the switch, so I’ll be able to see if any packets are dropped during this work.

I start with the backup member. I have to disable its data port and break the Virtual Chassis by disabling the VCP ports. Before that, I had to copy the Junos package over from member 0 to member 1, as member 1 would have no access to member 0 once the Virtual Chassis had been broken.

[email protected]> file copy /tmp/jinstall-ex-4200-13.2X51-D35.3-domestic-signed.tgz fpc1:/tmp/

This copies the package from member 0 to member 1. I confirmed this by entering the shell and checking the /tmp folder on member 1

{backup:1}
[email protected]> start shell 
[email protected]:BK:1% cd /tmp/
[email protected]:BK:1% ls -la
total 234744
drwxrwxrwt   3 root  wheel           512 Dec 18 14:57 .
drwxr-xr-x  23 root  wheel           512 Dec 18 04:06 ..
-rw-r--r--   1 root  wheel            92 Dec 18 12:13 .clnpkg.LCK
-rw-r--r--   1 root  wheel            92 Dec 18 12:13 .pkg.LCK
drwxrwxr-x   2 root  operator        512 Dec 18 12:10 .snap
-rw-r--r--   1 root  wheel     120120669 Dec 18 14:58 jinstall-ex-4200-13.2X51-D35.3-domestic-signed.tgz
-rw-r--r--   1 root  wheel           393 Dec 18 12:10 partitions.spec
[email protected]:BK:1% exit

Next, disable the member 1 port, in my case ge-1/0/2, with deactivate interfaces ge-1/0/2

[email protected]# run show interfaces ge-1/0/2          
Physical interface: ge-1/0/2, Administratively down, Physical link is Down

The server dropped 3 packets, which is acceptable to most; so far so good. Next I disabled the VCP ports on member 1 and member 0 and then consoled onto member 1.

[email protected]> request virtual-chassis vc-port set interface vcp-0 member 1 disable 
[email protected]> request virtual-chassis vc-port set interface vcp-1 member 1 disable
[email protected]> request virtual-chassis vc-port set interface vcp-0 disable 
[email protected]> request virtual-chassis vc-port set interface vcp-1 disable

Member 1 automatically took mastership and doesn’t see member 0 anymore

{master:1}
[email protected]> show virtual-chassis status 

Preprovisioned Virtual Chassis
Virtual Chassis ID: e8a9.d27b.0f05
Virtual Chassis Mode: Enabled
                                           Mstr           Mixed Neighbor List
Member ID  Status   Serial No    Model     prio  Role      Mode ID  Interface
0 (FPC 0)  NotPrsnt BP0214340104 ex4200-48t
1 (FPC 1)  Prsnt    BP0215090120 ex4200-48t 129  Master*      N

The server is still pinging along, so now we can upgrade the backup member as if it was a standalone device. We’ll run request system software add /tmp/jinstall-ex-4200-13.2X51-D35.3-domestic-signed.tgz validate reboot

Once member 1 had rebooted, I had to wait for a bit as it was looking for the master (due to the preprovisioned config), and it initially booted as a linecard; however, it changed back to master after I entered operational mode.

Next I enabled the member 1 port with activate interfaces ge-1/0/2

Comment
When I went to commit the change, it took an awfully long time to activate the interface; however, with a bit of patience the interface did come back up... eventually! Patience is the key!

To double check and confirm it was up, I checked the interface and the LLDP neighbors

[email protected]> show interfaces ge-1/0/2   
Physical interface: ge-1/0/2, Enabled, Physical link is Up
{master:1}
[email protected]> show lldp neighbors 
Local Interface    Parent Interface    Chassis Id          Port info          System Name
ge-1/0/2.0         ae1.0               00:0c:29:4f:26:bb   eth2               km-vm1

Now disable the member 0 port, in my case ge-0/0/2, with deactivate interfaces ge-0/0/2

The server dropped 47 packets after the interface was disabled. This was most likely due to the convergence time for the LACP bond after the port went down, as shown in the log messages

Logs
Dec 18 17:01:19  EX4200-A /kernel: Percentage memory available(19)less than threshold(20 %)- 14
Dec 18 17:01:50  EX4200-A dcd[5164]: ae0 : Warning: aggregated-ether-options link-speed no kernel value! default to  0
Dec 18 17:01:50  EX4200-A dcd[5164]: check_prot: p_ae NULL, ifdp->ifdp_type is 25 ifdp_ifname ae1
Dec 18 17:01:50  EX4200-A mgd[3713]: UI_CHILD_EXITED: Child exited: PID 5164, status 1, command '/sbin/dcd'
Dec 18 17:03:27  EX4200-A mgd[3713]: UI_COMMIT: User 'root' requested 'commit' operation (comment: none)
Dec 18 17:03:34  EX4200-A /kernel: Percentage memory available(19)less than threshold(20 %)- 15
Dec 18 17:04:07  EX4200-A dcd[5205]: ae0 : Warning: aggregated-ether-options link-speed no kernel value! default to  0
Dec 18 17:04:07  EX4200-A dcd[5205]: ae1 : Warning: aggregated-ether-options link-speed no kernel value! default to  0
Dec 18 17:04:12  EX4200-A lldpd[1326]: UI_CONFIGURATION_ERROR: Process: lldpd, path: , statement: , Configuration database open failure: Database is already open
Dec 18 17:04:13  EX4200-A mgd[3713]: UI_DBASE_LOGOUT_EVENT: User 'root' exiting configuration mode
Dec 18 17:04:52  EX4200-A dcd[1297]: ae0 : aggregated-ether-options link-speed set to kernel value of  10000000000
Dec 18 17:04:52  EX4200-A dcd[1297]: ae1 : Warning: aggregated-ether-options has no childern! link-speed set to  0
Dec 18 17:04:52  EX4200-A /kernel: ae_bundlestate_ifd_change: bundle ae1: bundle IFD minimum links not met 0 < 1
Dec 18 17:04:52  EX4200-A mib2d[1304]: SNMP_TRAP_LINK_DOWN: ifIndex 655, ifAdminStatus up(1), ifOperStatus down(2), ifName ae1
Dec 18 17:04:52  EX4200-A /kernel: GENCFG: op 22 (Sflow) failed; err 1 (Unknown)
Dec 18 17:04:52  EX4200-A /kernel: drv_ge_misc_handler: ifd:135  new address:cc:e1:7f:2b:82:85
Dec 18 17:04:53  EX4200-A mib2d[1304]: SNMP_TRAP_LINK_DOWN: ifIndex 708, ifAdminStatus up(1), ifOperStatus down(2), ifName ae1.0
Dec 18 17:04:53  EX4200-A mib2d[1304]: SNMP_TRAP_LINK_DOWN: ifIndex 506, ifAdminStatus down(2), ifOperStatus down(2), ifName ge-0/0/2
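
When the log is this noisy, it can help to pull out just the link-down traps to see which interfaces dropped and when. The following is a minimal Python sketch that works on a saved copy of the messages above; the function name link_down_events and the regular expression are my own illustration, not a Junos tool.

```python
import re

# Matches the mib2d SNMP_TRAP_LINK_DOWN lines exactly as they appear in the log above.
LINK_DOWN = re.compile(
    r"^(?P<time>\w{3}\s+\d+ \d\d:\d\d:\d\d)\s+\S+ mib2d\[\d+\]: SNMP_TRAP_LINK_DOWN: "
    r".*ifAdminStatus (?P<admin>\w+)\(\d+\), ifOperStatus (?P<oper>\w+)\(\d+\), "
    r"ifName (?P<name>\S+)$"
)


def link_down_events(log_text):
    """Return (timestamp, interface, admin status) for every link-down trap."""
    events = []
    for line in log_text.splitlines():
        match = LINK_DOWN.match(line.strip())
        if match:
            events.append((match["time"], match["name"], match["admin"]))
    return events


# A few lines copied from the log above.
SAMPLE = """\
Dec 18 17:04:52  EX4200-A /kernel: ae_bundlestate_ifd_change: bundle ae1: bundle IFD minimum links not met 0 < 1
Dec 18 17:04:52  EX4200-A mib2d[1304]: SNMP_TRAP_LINK_DOWN: ifIndex 655, ifAdminStatus up(1), ifOperStatus down(2), ifName ae1
Dec 18 17:04:53  EX4200-A mib2d[1304]: SNMP_TRAP_LINK_DOWN: ifIndex 708, ifAdminStatus up(1), ifOperStatus down(2), ifName ae1.0
Dec 18 17:04:53  EX4200-A mib2d[1304]: SNMP_TRAP_LINK_DOWN: ifIndex 506, ifAdminStatus down(2), ifOperStatus down(2), ifName ge-0/0/2
"""

events = link_down_events(SAMPLE)
for time, name, admin in events:
    print(f"{time}  {name:<10} admin={admin}")
```

The admin status is worth keeping in the output: ge-0/0/2 is admin down because it was deactivated on purpose, whereas ae1 and ae1.0 are admin up and only operationally down.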

With the server passing traffic over member 1, I could upgrade member 0 using the same command as before: request system software add /tmp/jinstall-ex-4200-13.2X51-D35.3-domestic-signed.tgz validate reboot

Same as member 1, it came back up after its reboot, but the switch took an age to find the master and just as long to commit the activation of interface ge-0/0/2! Extreme patience needed!

Confirmation that the link is up and that I have an LLDP neighbor:

{master:0}
[email protected]> show lldp neighbors 
Local Interface    Parent Interface    Chassis Id          Port info          System Name
ge-0/0/2.0         ae1.0               00:0c:29:4f:26:bb   eth1               km-vm1
{master:0}
[email protected]> show interfaces ge-0/0/2   
Physical interface: ge-0/0/2, Enabled, Physical link is Up

Both members are now on the same code, as expected:

{master:0}
[email protected]> show version 
fpc0:
--------------------------------------------------------------------------
Hostname: EX4200-A
Model: ex4200-48t
JUNOS EX  Software Suite [13.2X51-D35.3]
JUNOS FIPS mode utilities [13.2X51-D35.3]
JUNOS Online Documentation [13.2X51-D35.3]
JUNOS EX 4200 Software Suite [13.2X51-D35.3]
JUNOS Web Management [13.2X51-D35.3]

{master:1}
[email protected]> show version 
fpc1:
--------------------------------------------------------------------------
Hostname: EX4200-A
Model: ex4200-48t
JUNOS EX  Software Suite [13.2X51-D35.3]
JUNOS FIPS mode utilities [13.2X51-D35.3]
JUNOS Online Documentation [13.2X51-D35.3]
JUNOS EX 4200 Software Suite [13.2X51-D35.3]
JUNOS Web Management [13.2X51-D35.3]
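
Comparing version strings by eye is easy to get wrong, so here is a small Python sketch that does the comparison on saved show version output. It assumes the output has been captured as text; the function name versions_by_member is hypothetical, and it only looks at the "JUNOS EX  Software Suite" line for each fpc header.

```python
import re


def versions_by_member(show_version_output):
    """Map each fpcN header to the version on its 'JUNOS EX  Software Suite' line."""
    versions = {}
    member = None
    for raw in show_version_output.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"(fpc\d+):", line)
        if header:
            member = header.group(1)
            continue
        suite = re.fullmatch(r"JUNOS EX\s+Software Suite \[(.+)\]", line)
        if suite and member is not None:
            versions[member] = suite.group(1)
    return versions


# Trimmed copy of the two outputs above, joined into one string.
SAMPLE = """\
fpc0:
--------------------------------------------------------------------------
Hostname: EX4200-A
Model: ex4200-48t
JUNOS EX  Software Suite [13.2X51-D35.3]
JUNOS EX 4200 Software Suite [13.2X51-D35.3]

fpc1:
--------------------------------------------------------------------------
Hostname: EX4200-A
Model: ex4200-48t
JUNOS EX  Software Suite [13.2X51-D35.3]
JUNOS EX 4200 Software Suite [13.2X51-D35.3]
"""

versions = versions_by_member(SAMPLE)
same_code = len(set(versions.values())) == 1
print(versions, "match" if same_code else "MISMATCH")
```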

To get them joined together into the virtual chassis I enabled the VCP ports on member 0 and hoped this would bring them back together with no issues (He says!!!)

{master:0}
[email protected]> request virtual-chassis vc-port set interface vcp-0    
{master:0}
[email protected]> request virtual-chassis vc-port set interface vcp-1

To finish off, I ran the command request system snapshot slice alternate all-members to make sure the backup partition image was consistent with the primary

And finally everything is complete! I confirmed the virtual chassis, firmware version and LLDP neighbors, and upgraded the backup partition! Never forget to do this!

[email protected]> show virtual-chassis    

Preprovisioned Virtual Chassis
Virtual Chassis ID: e8a9.d27b.0f05
Virtual Chassis Mode: Enabled
                                                Mstr           Mixed Route Neighbor List
Member ID  Status   Serial No    Model          prio  Role      Mode  Mode ID  Interface
0 (FPC 0)  Prsnt    BP0214340104 ex4200-48t     129   Master*      N  VC   1  vcp-0      
                                                                           1  vcp-1      
1 (FPC 1)  Prsnt    BP0215090120 ex4200-48t     129   Backup       N  VC   0  vcp-0      
                                                                           0  vcp-1

[email protected]> show version              
fpc0:
--------------------------------------------------------------------------
Hostname: EX4200-A
Model: ex4200-48t
JUNOS EX  Software Suite [13.2X51-D35.3]
JUNOS FIPS mode utilities [13.2X51-D35.3]
JUNOS Online Documentation [13.2X51-D35.3]
JUNOS EX 4200 Software Suite [13.2X51-D35.3]
JUNOS Web Management [13.2X51-D35.3]

fpc1:
--------------------------------------------------------------------------
Hostname: EX4200-A
Model: ex4200-48t
JUNOS EX  Software Suite [13.2X51-D35.3]
JUNOS FIPS mode utilities [13.2X51-D35.3]
JUNOS Online Documentation [13.2X51-D35.3]
JUNOS EX 4200 Software Suite [13.2X51-D35.3]
JUNOS Web Management [13.2X51-D35.3]

[email protected]> show lldp neighbors 
Local Interface    Parent Interface    Chassis Id          Port info          System Name
ge-0/0/2.0         ae1.0               00:0c:29:4f:26:bb   eth1               km-vm1              
ge-1/0/2.0         ae1.0               00:0c:29:4f:26:bb   eth2               km-vm1
[email protected]> request system snapshot slice alternate  
fpc0:
--------------------------------------------------------------------------
Formatting alternate root (/dev/da0s1a)...
Copying '/dev/da0s2a' to '/dev/da0s1a' .. (this may take a few minutes)
The following filesystems were archived: /

fpc1:
--------------------------------------------------------------------------
Formatting alternate root (/dev/da0s2a)...
Copying '/dev/da0s1a' to '/dev/da0s2a' .. (this may take a few minutes)
The following filesystems were archived: /

From the running pings:

--- 192.31.1.1 ping statistics ---
9365 packets transmitted, 9234 received, +42 errors, 1% packet loss, time 9377278ms
rtt min/avg/max/mdev = 0.771/1.162/11.807/0.370 ms, pipe 3
[email protected]:~$

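There was 1% packet loss over the whole test (156 minutes), which works out as a 93.77 second outage using ping's truncated percentage. Counting the 131 packets actually lost (9365 sent, 9234 received) at roughly one ping per second puts it closer to 131 seconds, which still isn't too bad. Considering this was the first time I tried this method, I’ll be going over it again because it took far too long, but overall this method works!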
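
To make the outage arithmetic repeatable, here is a minimal Python sketch that works from the ping summary line itself. It uses the packet counts rather than the whole-number percentage that ping prints, and it estimates the interval between pings as elapsed time divided by packets sent, which is only an approximation. The function name ping_outage is my own.

```python
import re


def ping_outage(summary):
    """Estimate packet loss and total outage time from a ping summary line."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+) received.*?time (\d+)ms", summary
    )
    if match is None:
        raise ValueError("not a ping summary line")
    sent, received, elapsed_ms = map(int, match.groups())
    lost = sent - received
    interval_s = elapsed_ms / 1000 / sent  # approximate gap between pings
    return {
        "lost": lost,
        "loss_pct": 100 * lost / sent,
        "outage_s": lost * interval_s,
    }


# Summary line copied from the ping output above.
line = "9365 packets transmitted, 9234 received, +42 errors, 1% packet loss, time 9377278ms"
stats = ping_outage(line)
print(f"{stats['lost']} lost, {stats['loss_pct']:.2f}% loss, ~{stats['outage_s']:.0f}s without replies")
```

The same function can be pointed at the round-robin and active-backup summaries further down to compare the bonding modes on equal terms.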

I also messed about with the different types of bonding methods available:

With the round-robin or bond-type 0, the switch was configured as two access ports and I saw high packet loss during the testing.

--- 192.31.1.1 ping statistics ---
6106 packets transmitted, 3125 received, 48% packet loss, time 6128448ms
rtt min/avg/max/mdev = 0.814/1.484/902.641/16.131 ms

This was due to the nature of the round-robin bonding method.

Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance.

With the active-backup or bond-type 1, the switch was configured as two access ports and ping reported 0% packet loss during the testing. A slight difference when using active-backup (as expected to be honest) is that when you check the LLDP neighbors you’ll only see one interface up at a time.

This is due to the nature of the bond-type

Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch.
Ping Output / LLDP Difference
--- 192.31.1.1 ping statistics ---
2905 packets transmitted, 2892 received, 0% packet loss, time 2908023ms
rtt min/avg/max/mdev = 0.846/1.214/20.269/0.758 ms
[email protected]> show lldp neighbors 
Local Interface    Parent Interface    Chassis Id          Port info          System Name
ge-0/0/2.0         -                   00:0c:29:4f:26:bb   eth1               km-vm1              
vme.0              -                   00:19:06:cd:8f:80   GigabitEthernet1/0/36 oob-sw0-10.lab      
xe-0/1/0.0         ae0.0               78:fe:3d:46:2a:c0   xe-0/0/2.0         EX4500
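
The switch only shows one side of this. On a Linux server the bonding driver exposes its state as text under /proc/net/bonding/, and the short Python sketch below pulls out the mode and the currently active slave from that text. The bond name (for example bond0) and the sample text are assumptions for illustration; the sample was not captured from the lab server.

```python
def bond_summary(proc_text):
    """Extract the bonding mode, active slave and slave list from bonding status text."""
    info = {"mode": None, "active_slave": None, "slaves": []}
    for line in proc_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Currently Active Slave":
            info["active_slave"] = value
        elif key == "Slave Interface":
            info["slaves"].append(value)
    return info


# Hypothetical sample, trimmed to the fields used here; not taken from the lab server.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth1
MII Status: up

Slave Interface: eth1
MII Status: up

Slave Interface: eth2
MII Status: up
"""

summary = bond_summary(SAMPLE)
print(summary)
```

On a real server you would pass in the contents of the bond's file instead of SAMPLE, and the active slave should line up with the one interface the switch reports as an LLDP neighbor.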

Having got a method that worked, the sections below show some of the methods I tried that failed. Looking back, these two methods were never going to work; however, this is why you have a lab, and it’s always good to see things for yourself and find out whether you can troubleshoot your way out! With all that being said, I’ve actually picked up a few things I didn’t know, so this was a good exercise!

Tester Method #1
Upgrade member 1, then see if you can fail over the routing-engine from member 0 to member 1. The issue that could arise is that the routing-engine will not fail over, because the two switches will be on different versions of code and member 1 will not rejoin the VC as the backup routing-engine.

I started the upgrade on member 1 running:

request system software add /tmp/jinstall-ex-4200-13.2X51-D35.3-domestic-signed.tgz member 1 reboot

Once the upgrade had completed, I checked the virtual chassis and, as I thought, member 1 didn’t join back into the VC as the backup routing-engine:

[email protected]> show virtual-chassis    

Preprovisioned Virtual Chassis
Virtual Chassis ID: e8a9.d27b.0f05
Virtual Chassis Mode: Enabled
                                           Mstr           Mixed Neighbor List
Member ID  Status   Serial No    Model     prio  Role      Mode ID  Interface
0 (FPC 0)  Prsnt    BP0214340104 ex4200-48t 129  Master*      N  1  vcp-0      
                                                                 1  vcp-1      
1 (FPC 1)  Inactive BP0215090120 ex4200-48t 129  Linecard     N  0  vcp-0      
                                                                 0  vcp-1

And the logs show the mismatch of code and timeout of member 1 rejoining the VC. This method is out, but then it was expected to be honest
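
A member that has fallen out of the VC is easy to miss in a wide table, so here is a Python sketch that flags it from saved show virtual-chassis output. The regular expression is written against the table layout shown above and the function name parse_members is hypothetical; other platforms or releases may format the table differently.

```python
import re

# One row per member: ID, status, serial, model, priority, role (with * on the local member).
MEMBER_ROW = re.compile(
    r"^(\d+) \(FPC \d+\)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\w+)(\*?)"
)


def parse_members(output):
    """Return one dict per member row; neighbor-only continuation lines are skipped."""
    members = []
    for line in output.splitlines():
        match = MEMBER_ROW.match(line.strip())
        if match:
            member_id, status, serial, model, prio, role, star = match.groups()
            members.append(
                {"id": int(member_id), "status": status, "role": role, "local": star == "*"}
            )
    return members


# Table rows copied from the output above.
SAMPLE = """\
Member ID  Status   Serial No    Model     prio  Role      Mode ID  Interface
0 (FPC 0)  Prsnt    BP0214340104 ex4200-48t 129  Master*      N  1  vcp-0
                                                                 1  vcp-1
1 (FPC 1)  Inactive BP0215090120 ex4200-48t 129  Linecard     N  0  vcp-0
                                                                 0  vcp-1
"""

members = parse_members(SAMPLE)
not_present = [m["id"] for m in members if m["status"] != "Prsnt"]
print("members not present:", not_present)
```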

Tester Method #2
Upgrade using the NSSU method and, when it gets stuck, see if you can abort and fail over. This method sounds like a bit of a hack that won’t work; however, we’re in the lab so it doesn’t matter, and if it works then yaaay!

I ran the command to kick off the NSSU:

request system software nonstop-upgrade /tmp/jinstall-ex-4200-13.2X51-D35.3-domestic-signed.tgz
Chassis ISSU Check Done
[Dec 18 08:41:50]:ISSU: Validating Image
[Dec 18 08:42:20]:ISSU: Preparing Backup RE
[Dec 18 08:42:21]: Installing image on other FPC's along with the backup

[Dec 18 08:42:21]: Checking pending install on fpc1
[Dec 18 08:43:21]: Pushing bundle to fpc1
NOTICE: Validating configuration against mchassis-install.tgz.
NOTICE: Use the 'no-validate' option to skip this if desired.
WARNING: A reboot is required to install the software
WARNING:     Use the 'request system reboot' command immediately
[Dec 18 08:44:23]: Completed install on fpc1
[Dec 18 08:44:34]: Backup upgrade done
[Dec 18 08:44:34]: Rebooting Backup RE

Rebooting fpc1
[Dec 18 08:44:34]:ISSU: Backup RE Prepare Done
[Dec 18 08:44:34]: Waiting for Backup RE reboot

Having a console into member 1, I can see that member 1 has joined the VC cluster

{backup:1}
[email protected]> show virtual-chassis 

Preprovisioned Virtual Chassis
Virtual Chassis ID: e8a9.d27b.0f05
Virtual Chassis Mode: Enabled
                                                Mstr           Mixed Route Neighbor List
Member ID  Status   Serial No    Model          prio  Role      Mode  Mode ID  Interface
0 (FPC 0)  Prsnt    BP0214340104 ex4200-48t     129   Master       N  VC   1  vcp-0      
                                                                           1  vcp-1      
1 (FPC 1)  Prsnt    BP0215090120 ex4200-48t     129   Backup*      N  VC   0  vcp-0      
                                                                           0  vcp-1 

I aborted the NSSU on the member 0 console and tried to fail over the routing-engine. However, after aborting the NSSU it took over an hour to get the operational prompt back, and by then the VC cluster had detached and was back to Master and Linecard. This makes sense, as the switches are now out of the NSSU process and simply go back to seeing two mismatched Junos versions.

{master:0}
[email protected]> show virtual-chassis 

Preprovisioned Virtual Chassis
Virtual Chassis ID: e8a9.d27b.0f05
Virtual Chassis Mode: Enabled
                                           Mstr           Mixed Neighbor List
Member ID  Status   Serial No    Model     prio  Role      Mode ID  Interface
0 (FPC 0)  Prsnt    BP0214340104 ex4200-48t 129  Master*      N  1  vcp-0      
                                                                 1  vcp-1      
1 (FPC 1)  Inactive BP0215090120 ex4200-48t 129  Linecard     N  0  vcp-0      
                                                                 0  vcp-1

This method is out (but then this was expected)

Side Notes
Note 1
If you want to change which member is master, you can release the routing-engine mastership by running request chassis routing-engine master release. Be aware that this causes an outage: the PFE will switch over, and because you cannot have nonstop routing (configured under the routing-options stanza) and graceful switchover (configured under chassis redundancy) in place here, traffic will drop.
Note 1.5
Additionally, any config changes you made while the members were separated will not be kept if you switch over the PFE. I saw this when I enabled interface ge-1/0/2 on member 1: after the PFE switchover it became inactive.
Rollback Firmware Upgrade
[email protected]> request system software rollback member 1 reboot 
fpc1:
--------------------------------------------------------------------------
Junos version '12.3R5.7' will become active at next reboot
Rebooting ...
shutdown: [pid 1280]
Shutdown NOW!

Then, once member 1 had rebooted, I checked to make sure it was present in the virtual chassis:

[email protected]> show virtual-chassis                                

Preprovisioned Virtual Chassis
Virtual Chassis ID: e8a9.d27b.0f05
Virtual Chassis Mode: Enabled
                                           Mstr           Mixed Neighbor List
Member ID  Status   Serial No    Model     prio  Role      Mode ID  Interface
0 (FPC 0)  Prsnt    BP0214340104 ex4200-48t 129  Master*      N  1  vcp-0      
                                                                 1  vcp-1      
1 (FPC 1)  Prsnt    BP0215090120 ex4200-48t 129  Backup       N  0  vcp-0      
                                                                 0  vcp-1