Sunday, February 15, 2015

Linux Bonding Channel Driver, on Slackware P3

Basically, the bonding driver included in the stock Slackware 14.1 kernel (3.10.17) still supports the ifenslave tool.

Unfortunately, there isn't a current SlackBuild for ifenslave in either Slackware's official package repository or the SlackBuilds.org repository.

So, you're left building it from scratch as described in the source file ifenslave.c (/usr/src/linux/Documentation/networking/ifenslave.c):

compile-command: "gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave"
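
If that compile command works for you, the whole build-and-install boils down to something like the following (a minimal sketch; the install location is my own choice, nothing mandated):

cd /usr/src/linux/Documentation/networking
gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave
# Put the binary somewhere in root's PATH (location is arbitrary).
install -m 755 ifenslave /sbin/ifenslave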

When I first attempted to use the bonding driver, it was on my PowerEdge 750 (arch: i686). For whatever reason, I could never get this source file to compile into the ifenslave binary w/o encountering a build error.

This prompted me to look for an alternative. All the documentation I've seen on setting up bonding in Slackware walks you through this step first. In the major distros, ifenslave is either already included in the official repository, or bonding is integrated into the init scripts.

After some searching I finally came across an alternative: Vincent Batts had posted a SlackBuild for ifenslave! It was hidden in the discarded directory (I believe it was for Slackware 12), and it built just fine on my system!

(I later discovered what it was in Vincent's SlackBuild that actually allowed ifenslave to build properly on my 14.1 x86 PowerEdge: the arch variable wasn't set via a case statement tied to uname, as is normally done--it was hardcoded to 386.)

I was able to run the binary, initialize a bonded interface w/ ifconfig, and successfully use ifenslave to set the slave interfaces under the bond.

I've received Vincent's permission to become the maintainer of the SlackBuild for the current release. I've modified it w/ enough core functionality to enable a Slackware installation to easily configure bonding. The difficulties I ran into while trying to enable bonding in Slackware compelled me to create a solution that would bring this much-needed functionality to Slackware in a simple installable package.

However, there is a caveat: the current version of the Linux bonding driver (in the current stable kernel) no longer officially supports ifenslave--its use is now considered obsolete. This is sad, but the devs' reasoning is that a) most distros include bonding functionality in their init scripts (a la RHEL), and b) lacking such support, you can always create a bond via the sysfs interface (/sys/class/net/bondX).

That being said, Slackware 14.1's kernel (3.10.17) still uses the older bonding driver, which supports ifenslave.

In fact, the current bonding driver doc mentions that ifenslave will still function w/ the bonding driver and can be utilized. They just no longer support it, favoring the sysfs interface instead (which actually does provide more functionality in the bond initialization process).
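
For the curious, here's a rough sketch of what creating the same bond through sysfs looks like (interface names and addresses are just examples; the mode has to be set before any slaves are attached, and the slaves have to be down before they're added):

modprobe bonding
echo "+bond0" > /sys/class/net/bonding_masters
echo "balance-rr" > /sys/class/net/bond0/bonding/mode
echo "50" > /sys/class/net/bond0/bonding/miimon
ifconfig eth0 down
ifconfig eth1 down
echo "+eth0" > /sys/class/net/bond0/bonding/slaves
echo "+eth1" > /sys/class/net/bond0/bonding/slaves
ifconfig bond0 192.168.0.210 netmask 255.255.255.0 up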

So, I realize there is a need for this SlackBuild, although it will only be useful to a small audience (14.1). I'm not sure which kernel -current is using, or whether the bonding driver in that release still supports ifenslave.

Edit: I just checked kernel 3.14.29 (used at the moment in -current); it has the updated bonding driver, which does not officially support ifenslave.

To continue on though, ifenslave isn't the only hurdle we need to overcome in Slackware. Sure, you could create a bond manually by simply:

 - initializing the bond via ifconfig
 - designating the slave interfaces via ifenslave
 - adding the route for the bonded interface to the kernel routing table manually

However, if you want a bonded interface to be created during the boot/startup process--because rc.inet1 includes no functionality for bonded interfaces--you will need to either add the bond initialization commands to rc.local or, the better solution in my opinion, create another rc.d init script. (Note: the script will need executable permissions, i.e. chmod 775.)

#!/bin/sh
# rc.bond - initialize the bonded interface at boot

case "$1" in
  'start')
    echo "start bond0"
    # Load the bonding driver; mode=0 is balance-rr, and miimon is the
    # MII link-monitoring interval in milliseconds.
    #modprobe bonding mode=balance-alb miimon=100
    modprobe bonding mode=0 miimon=50
    # Load the driver for the slave NICs (tg3 here; change to match your hardware).
    modprobe tg3
    # Bring up the bond and assign its address.
    ifconfig bond0 up
    ifconfig bond0 192.168.0.210 netmask 255.255.255.0
    # Attach the slave interfaces to the bond.
    ifenslave bond0 eth0
    ifenslave bond0 eth1
    # TODO: needs to be changed if you run more than one bond on the LAN.
    ifconfig bond0 hw ether 00:16:3e:aa:aa:aa
    # Add the default route through the bonded interface.
    route add default gw 192.168.0.1 metric 1 bond0
    ;;
  'stop')
    ifconfig bond0 down
    rmmod bonding
    rmmod tg3
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    ;;
esac
#EOF


Credit for the original rc.bond script goes to Mehdi Sadighian.

The bond's IP and netmask, and the gateway IP, will need to be changed to reflect your network.

My only modification was adding the route command for the default gw.

Also: make sure you modify the HW MAC address if you intend to:
 - make multiple bonds on the same system w/ this script
 - utilize this rc.bond script on multiple systems in the same LAN (or VLAN).

When I first tested this out, I had just initialized a bonded interface on both of my servers in my LAN; however, I was stumped when my attempts to ssh into either system would time out. After digging, I realized the HW MAC address was the same for both systems on the LAN. Once I ensured each system had a unique MAC, ssh returned to normal and everything was right once again in the LAN.

Apparently, 00:16:3e (the first three octets of the address above) is the XenSource MAC prefix. Xen recommends its use because it will not conflict w/ any known hardware MAC address. I feel this reason merits leaving the prefix here.
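
For example (the addresses below are made up--just keep the prefix and vary the tail per bond/system):

# Server 1's rc.bond
ifconfig bond0 hw ether 00:16:3e:aa:aa:01
# Server 2's rc.bond
ifconfig bond0 hw ether 00:16:3e:aa:aa:02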

I've experimented w/ modifying rc.inet1.conf to include a bonding statement, but... it doesn't quite work right. For one, that isn't the proper file to modify, since it's sourced by rc.inet1 (which is the rc script that should actually be modified). Second, because rc.inet1 always sources rc.inet1.conf, it will call rc.bond every time you run the rc.inet1 script--start, stop, restart, doesn't matter--regardless of the subcommand you pass to it. So when I shut down my network, the rc.bond script would initiate.

I suppose a workaround would be to modify the case statement to only source rc.inet1.conf during network initialization... this may be something to consider; still, it's really only a hack.
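
One rough, untested sketch of a similar effect without touching rc.inet1 itself: since rc.inet1.conf is sourced (not executed), it can see rc.inet1's own positional parameters, so the rc.bond call could be guarded on the subcommand:

# In rc.inet1.conf -- "$1" here is the subcommand passed to rc.inet1.
if [ "$1" = "start" ] && [ -x /etc/rc.d/rc.bond ]; then
  /etc/rc.d/rc.bond start
fi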

Honestly, even if the network stack stops, it isn't a problem if the rc.bond script initializes... it can even be seen as a good thing: you'll know none of the slave interfaces will accidentally be initialized w/ an IP address and route attached. (BTW--the documentation does list that as a known problem to watch for.)

However, as you've seen from my previous modifications to rc.inet1.conf, I've added sections for DNS and hosts file settings, which makes rc.inet1.conf more valuable to me--valuable enough that I don't want to initialize a bond w/o these settings being enabled.

So for now, if you want your system to start up w/ a bonded interface, the simplest solution is to create the rc.bond script shown above, modify the bond to reflect your network (or add additional bonds), and finally add the following to rc.local:

/etc/rc.d/rc.bond start
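
If you'd rather mimic the way most optional rc scripts are called in Slackware, you can wrap it in an executable check:

if [ -x /etc/rc.d/rc.bond ]; then
  /etc/rc.d/rc.bond start
fi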

In the SlackBuild, I plan to include the ifenslave tool, a sample rc.bond script to modify (like the one above), and a README explaining the above.

In the future I will create a SlackBuild that utilizes the sysfs interface instead, and thus uses a supported method of bond creation, which will hopefully bring some cool features too.


Linux Bonding Channel Driver, on Slackware P2

So, I've been trying to get this to work for... shit, since October 2014, with no success until now.

First, I tried bonding interfaces w/ my unmanaged HP ProCurve 2724 16-port gigabit switch. No luck.

I figured I should look into a managed switch, hoping LACP functionality would be present.

So I headed to my favourite place (Westech Recyclers--Resell Electronics) and picked up a D-Link DGS-1248T for a whopping 10 buckaroos. (Apparently a steal--this thing retailed for $300-600 new, holy cow.)

So I had made a big misunderstanding in my research: I assumed the "trunking" functionality described in the Linux Bonding Driver documentation was, well, compatible w/ 802.3ad!

In my eagerness, I looked at the modes and chose 802.3ad (mode=4) as the one I wanted to use.

What I didn't realize is that 802.3ad--which utilizes LACP--is an IEEE specification completely different from trunking.

In fact, I didn't know what trunking really meant. I knew it was described in the Bonding Driver documentation, and that it was supported by my switch (DGS-1248T). When I referred to the switch's documentation, it wasn't descriptive in the least:

"The Trunk function enables the Switch to cascade two or more devices with larger bandwidths."

Devices? You mean ports? Or devices as in switches? Is this meant to trunk switches or ports? I was royally confused.

So here I was, spinning my wheels for weeks, trying to get mode=4 working w/ the DGS-1248T. I wasn't using VLANs; I simply enabled two groups of ports: one for my main system, and one for my other storage server.

Whenever I enabled the bond, I added the default gw route on the bond, but it could never reach the gateway--the command would time out. Actually, it did work twice, but it would not work after reboots, nor after disabling and re-enabling the bonded interface.

I was stumped. What is going on?

Finally, I read a post on the LQ forums regarding another user's experience w/ this switch and bonded interfaces--specifically, his bonds were working! I thought--no way, how?

He wasn't using mode=4. He was using mode=2 (Balance-xor).

I decided to try it. On my system, I went with mode=0 (balance-rr). I was now able to add the default gw route on the bond! And I could communicate w/ the LAN and the internet!

I determined that the bond was initialized correctly and functioning on both of my servers (an HP ML350 G5 and a PowerEdge 1850). Both bonded interfaces could send and receive on the LAN and internet.
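
(The easiest way I know of to verify this is the status file the bonding driver exposes:)

cat /proc/net/bonding/bond0
# Shows the bonding mode, MII status, and each slave's link state.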

I still didn't understand why. It wasn't until I was looking at the Bonding Driver documentation this evening that I realized why:

Requirements for specific modes:

... The Switch must be configured for "etherchannel," or "trunking," on the appropriate ports.

As I later discovered, EtherChannel is Cisco's proprietary implementation of link aggregation, which predated the 802.3ad spec. Furthermore, because EtherChannel is Cisco proprietary technology, the trunking variants seen in other manufacturers' switches must be each manufacturer's own implementation--an EtherChannel variant.

In other words, the only things they really lose are support for ISL and VTP, both Cisco proprietary technologies anyway.

The main reason I went through all this trouble was to determine firsthand whether there was any measurable speed increase from using bonding.

I made a simple test: basically, I transferred a large file over a single gigabit link (via a mounted NFS directory, using a simple copy).

As I've read--and discovered in my own experience--although the theoretical maximum transfer rate for a gigabit interface is 125 MB/s, the real transfer rate is much slower. The reasons vary considerably, but generally mechanical drives are the main bottleneck (followed by system load or network traffic). Other tests I've seen that measure a true gigabit transfer rate used a ramdrive. So, for this test I created a 2 GB ramdisk via tmpfs. I then copied a large file (a 1.5GB mp4) over a locally mounted NFS share (hosted on the second server, which was also using a bonded interface) to the tmpfs-mounted ramdisk.
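
For reference, the test setup boiled down to something like this (the server IP, export path, and file name are examples, not my exact ones):

# Create the 2 GB ramdisk and mount the NFS export from the second server.
mkdir -p /mnt/ramdisk /mnt/nfs
mount -t tmpfs -o size=2G tmpfs /mnt/ramdisk
mount -t nfs 192.168.0.211:/export/media /mnt/nfs
# Time the copy of the 1.5GB file from the NFS mount to the ramdisk.
time cp /mnt/nfs/bigfile.mp4 /mnt/ramdisk/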

I then reran the same test, using the bonded interface w/ mode=0 (balance-rr).

The results did not disappoint!

Transferred the 1.5GB mp4 file over locally mounted NFS (hosted on the secondary server) to the tmpfs-mounted filesystem:

Single 1Gb NIC transfer: 50 MB/s

Bond0 (dual 1Gb NIC slaves): 94 MB/s

The bond transferred nearly twice as fast!

Linux Ethernet Bonding Driver, on Slackware P1

I've been wanting to utilize the Linux Ethernet Bonding Driver in my lab ever since I got my hands on a few managed switches that support trunking. I had bonded Fibre Channel HBA interfaces for our clients' servers when I was a Linux admin (all of the systems in the Phoenix, AZ co-location were internetworked using 8Gb Fibre Channel interfaces; the storage arrays, as well as the core infrastructure switches the clients' environments were hosted on, utilized 16Gb FC).

However, that was mostly on RHEL 6 systems, with some SLES, one or two Ubuntu servers, and Oracle Enterprise Linux--Oracle's RHEL clone, which now goes by Oracle Linux (probably to avoid confusion).

I have to say the majority of the systems I managed were RHEL, w/ Oracle Linux coming in second. Most of the Oracle Linux systems were actually full-blown virtualization servers, utilizing Oracle's proprietary virtualization platform, Oracle VM Server. (Unlike the open-source VirtualBox, VM Server is Oracle's customized version of Xen. In fact, all of the Oracle Linux systems were running VM Server to host the clients' systems.)

Anyways, enough reminiscing; that didn't really help in my current endeavor. Although RHEL has functionality in its /etc/sysconfig/network-scripts ifcfg files to recognize and initiate bonded interfaces, Slackware is not quite there.

In fact, according to my findings, there is no built-in functionality for bonding at all in the network init script, /etc/rc.d/rc.inet1.

So, taking a look at /etc/rc.d/rc.inet1, we can figure out what is going on when the network is initialized. Basically, the author created a few main pieces:

 - read the network config file (rc.inet1.conf)
 - log to the system logfile (/var/log/messages)
 - determine the interface list (a while loop w/ an incrementing array variable to differentiate between eth0, eth1, etc.)
 - loopback functions (initialize the loopback interface)
 - interface functions (the real enchilada of the script--br_open, br_close, if_up, and if_down)
 - gateway function (a route command that initializes the gateway defined in rc.inet1.conf)
 - the main function (a case statement that defines the network script's subcommands: start, restart, up, down, etc.)

So, we have to look into the interface functions, specifically if_up. We really don't have to go through the whole thing line by line; simply skimming it will reveal that there is no functionality inherent in this script to initialize a bonded interface as is done in RHEL.
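
A quick sanity check confirms it:

grep -i bond /etc/rc.d/rc.inet1 /etc/rc.d/rc.inet1.conf
# On a stock 14.1 install this should come back empty.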

If Slackware is really going to be considered a server OS, we need to modify this as soon as possible. I've hacked up my own workaround, but I'm not confident enough in my ability to modify the rc.inet1 network script that it would be foolproof (error-free and not prone to wacky behavior). I'm willing to try (believe me, I've started), but I feel this issue needs to be evaluated by the Slackware team (Pat, Rob, Eric, Sudayo--I hope someone's reading this). This could easily be solved by adding a function specifically for initializing a bonded interface to the interface functions and, even more important, by making rc.inet1.conf able to recognize a bonded interface and pass those variables to the main enchilada (rc.inet1).
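
To give an idea of what I mean, here's a very rough, non-working sketch of what such a function might look like; the BOND[] and BONDNICS[] arrays are names I made up for illustration, not existing rc.inet1.conf settings (only IPADDR[] and NETMASK[] exist there today):

# Hypothetical bond_up function for rc.inet1; $1 is the bond's index into
# the config arrays, mirroring rc.inet1.conf's IPADDR[0]/NETMASK[0] style.
bond_up() {
  i=$1
  /sbin/modprobe bonding
  echo "+${BOND[$i]}" > /sys/class/net/bonding_masters
  for NIC in ${BONDNICS[$i]}; do
    /sbin/ifconfig $NIC down                                   # slaves must be down
    echo "+$NIC" > /sys/class/net/${BOND[$i]}/bonding/slaves   # enslave via sysfs
  done
  /sbin/ifconfig ${BOND[$i]} ${IPADDR[$i]} netmask ${NETMASK[$i]} up
}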

I've tried adding my own functions, but I keep overlooking some important steps... mainly, how to differentiate between whether the config defines a bridge interface (which must be created first), a bonded interface, or a bridge over a set of bonded interfaces--or, if there are no bridges and only bonds are used, how to initialize those first while ensuring the slave interfaces are left disabled.

I've considered looking at Red Hat's network init scripts for ideas, but as I've confessed, I am not confident any modifications I create would work properly.

I really hope the Slackware team considers this suggestion and realizes that it would benefit a large target audience--servers specifically--and aid in Slackware's adoption as a server OS, especially w/ the RHEL crowd.

____

To be continued...