While testing for my previous post, I ran into an issue: I had configured a bond mode 1 (active-backup) interface, but the active slave never failed over when I disconnected the interface. For the life of me, I didn’t have a clue why! I later found out that the configuration on the ESXi host’s vSwitch was wrong, which is why the failover never happened.
Before I found out about the ESXi vSwitch problem, I looked at a number of different ways to fix this issue. While searching, I found a great article written by Ivan Erben on how to manually fail over the active slave in a bond mode 1 configuration.
It was quite straightforward, as I like it :p
First, check which interface is the active slave by running cat /proc/net/bonding/bond0:
    [email protected]:~$ cat /proc/net/bonding/bond0
    Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

    Bonding Mode: fault-tolerance (active-backup)
    Primary Slave: None
    Currently Active Slave: eth1
    MII Status: up
    MII Polling Interval (ms): 0
    Up Delay (ms): 0
    Down Delay (ms): 0

    Slave Interface: eth1
    MII Status: up
    Speed: 10000 Mbps
    Duplex: full
    Link Failure Count: 0
    Permanent HW addr: 00:0c:29:4f:26:c5
    Slave queue ID: 0

    Slave Interface: eth2
    MII Status: up
    Speed: 10000 Mbps
    Duplex: full
    Link Failure Count: 0
    Permanent HW addr: 00:0c:29:4f:26:cf
    Slave queue ID: 0
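If you only want the name of the active slave rather than the whole status dump, the relevant line can be pulled out with awk. This is a minimal sketch: the function name active_slave is my own, and it assumes the status file has the "Currently Active Slave: ethX" line shown above.

```shell
#!/bin/sh
# Print the currently active slave of a bond by parsing its status file.
# active_slave is a hypothetical helper name, not part of any tool.
# The path defaults to the bond0 file used in this post; a different file
# can be passed as the first argument (e.g. a saved copy of the output).
active_slave() {
    status_file="${1:-/proc/net/bonding/bond0}"
    # The line of interest looks like: "Currently Active Slave: eth1"
    awk -F': ' '/^Currently Active Slave:/ { print $2; exit }' "$status_file"
}
```

Calling active_slave with no arguments reads /proc/net/bonding/bond0, so on the box above it would print the interface named on the "Currently Active Slave" line.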
Having seen that eth1 is the active slave, we can remove the interface from the bond by running echo -eth1 > /sys/class/net/bond0/bonding/slaves as root:
    [email protected]:~$ sudo -s
    [sudo] password for marquk01:
    [email protected]:~# echo -eth1 > /sys/class/net/bond0/bonding/slaves
We can see that eth1 has been removed from the bond configuration and eth2 is now the active slave:
    [email protected]:~# cat /proc/net/bonding/bond0
    Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

    Bonding Mode: fault-tolerance (active-backup)
    Primary Slave: None
    Currently Active Slave: eth2
    MII Status: up
    MII Polling Interval (ms): 0
    Up Delay (ms): 0
    Down Delay (ms): 0

    Slave Interface: eth2
    MII Status: up
    Speed: 10000 Mbps
    Duplex: full
    Link Failure Count: 0
    Permanent HW addr: 00:0c:29:4f:26:cf
    Slave queue ID: 0
The bond will still pass traffic and work as expected. To add the interface back into the bond, we need to run echo +eth1 > /sys/class/net/bond0/bonding/slaves
As we can see, eth1 has been added back into the bond and eth2 remains the active slave.
    [email protected]:~# echo +eth1 > /sys/class/net/bond0/bonding/slaves
    [email protected]:~# cat /proc/net/bonding/bond0
    Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

    Bonding Mode: fault-tolerance (active-backup)
    Primary Slave: None
    Currently Active Slave: eth2
    MII Status: up
    MII Polling Interval (ms): 0
    Up Delay (ms): 0
    Down Delay (ms): 0

    Slave Interface: eth2
    MII Status: up
    Speed: 10000 Mbps
    Duplex: full
    Link Failure Count: 0
    Permanent HW addr: 00:0c:29:4f:26:cf
    Slave queue ID: 0

    Slave Interface: eth1
    MII Status: up
    Speed: 10000 Mbps
    Duplex: full
    Link Failure Count: 0
    Permanent HW addr: 00:0c:29:4f:26:c5
    Slave queue ID: 0
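The manual steps above (find the active slave, remove it, add it back) could be wrapped into one small function. Treat this as a rough sketch rather than a tested tool: force_failover, BOND_PROC and BOND_SYSFS are names I made up for illustration, the defaults are the bond0 paths used in this post, and it has to run as root.

```shell
#!/bin/sh
# Force a failover on an active-backup bond by removing the active slave
# and adding it straight back, mirroring the manual steps. Run as root.
# force_failover, BOND_PROC and BOND_SYSFS are hypothetical names used
# only for this sketch; the defaults are the bond0 paths from the post.
force_failover() {
    proc_file="${BOND_PROC:-/proc/net/bonding/bond0}"
    slaves_file="${BOND_SYSFS:-/sys/class/net/bond0/bonding/slaves}"

    # Read the interface named on the "Currently Active Slave:" line.
    active=$(awk -F': ' '/^Currently Active Slave:/ { print $2; exit }' "$proc_file")
    if [ -z "$active" ] || [ "$active" = "None" ]; then
        echo "no active slave found in $proc_file" >&2
        return 1
    fi

    # "-ethX" removes the interface from the bond, "+ethX" adds it back.
    printf '%s\n' "-$active" > "$slaves_file" || return 1
    printf '%s\n' "+$active" > "$slaves_file" || return 1

    # Report which interface was cycled out of the active role.
    printf '%s\n' "$active"
}
```

Note that the sketch does not check that a second healthy slave exists before removing the active one, so on a bond with a single slave it would interrupt traffic. It is worth running cat /proc/net/bonding/bond0 first, as above.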
This is very useful if you have planned maintenance, or if you need a quick failover of interfaces and don’t have link detection enabled. Definitely a great find and post by Ivan! You can check out his blog here.